Overview


Dataset statistics

Number of variables: 93
Number of observations: 338440
Missing cells: 17578413
Missing cells (%): 55.8%
Total size in memory: 240.1 MiB
Average record size in memory: 744.0 B
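
The figures in this table are related by simple arithmetic. The sketch below re-derives the missing-cell percentage and the total memory size from the other reported values. It assumes that "Missing cells (%)" is taken over all cells (variables × observations) and that MiB means 1024² bytes; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 93
n_observations = 338_440
missing_cells = 17_578_413
avg_record_bytes = 744.0

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = round(100 * missing_cells / total_cells, 1)

# Assumption: MiB = 1024**2 bytes.
total_mib = round(avg_record_bytes * n_observations / 1024**2, 1)

print(total_cells)   # number of cells in the dataset
print(missing_pct)   # compare with "Missing cells (%)"
print(total_mib)     # compare with "Total size in memory"
```

Both derived values agree with the table after rounding to one decimal place.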

Variable types

Text: 93

Dataset

Description: NMNH Material Samples (USNM) 0049394-241126133413365
URL: https://doi.org/10.15468/dl.ycwxgd

Alerts

institutionID has constant value "http://grbio.org/cool/142r-0w94" Constant
datasetName has constant value "NMNH Material Samples (USNM)" Constant
basisOfRecord has constant value "MaterialSample" Constant
occurrenceStatus has constant value "present" Constant
organismScope has constant value "963.0" Constant
eventTime has constant value "94648" Constant
eventRemarks has constant value "Guide to Best Practices for Georeferencing. (Chapman and Wieczorek, eds. 2006). Google Earth Pro" Constant
geologicalContextID has constant value "(Keferstein)" Constant
earliestEraOrLowestErathem has constant value "Chordata" Constant
catalogNumber has 70749 (20.9%) missing values Missing
recordNumber has 181774 (53.7%) missing values Missing
recordedBy has 70194 (20.7%) missing values Missing
individualCount has 39392 (11.6%) missing values Missing
sex has 176515 (52.2%) missing values Missing
lifeStage has 205051 (60.6%) missing values Missing
preparations has 251349 (74.3%) missing values Missing
associatedMedia has 323875 (95.7%) missing values Missing
associatedSequences has 305730 (90.3%) missing values Missing
occurrenceRemarks has 193737 (57.2%) missing values Missing
organismID has 338436 (> 99.9%) missing values Missing
organismName has 338438 (> 99.9%) missing values Missing
organismScope has 338439 (> 99.9%) missing values Missing
materialSampleID has 85078 (25.1%) missing values Missing
eventType has 338436 (> 99.9%) missing values Missing
fieldNumber has 267431 (79.0%) missing values Missing
eventDate has 16369 (4.8%) missing values Missing
eventTime has 338439 (> 99.9%) missing values Missing
startDayOfYear has 18131 (5.4%) missing values Missing
endDayOfYear has 17911 (5.3%) missing values Missing
year has 16370 (4.8%) missing values Missing
month has 17966 (5.3%) missing values Missing
day has 19384 (5.7%) missing values Missing
verbatimEventDate has 236098 (69.8%) missing values Missing
habitat has 302334 (89.3%) missing values Missing
eventRemarks has 338439 (> 99.9%) missing values Missing
locationID has 284922 (84.2%) missing values Missing
higherGeography has 4534 (1.3%) missing values Missing
continent has 144951 (42.8%) missing values Missing
waterBody has 231595 (68.4%) missing values Missing
islandGroup has 315692 (93.3%) missing values Missing
island has 279541 (82.6%) missing values Missing
country has 14430 (4.3%) missing values Missing
stateProvince has 66214 (19.6%) missing values Missing
county has 140615 (41.5%) missing values Missing
locality has 34082 (10.1%) missing values Missing
minimumElevationInMeters has 249251 (73.6%) missing values Missing
maximumElevationInMeters has 284628 (84.1%) missing values Missing
verbatimElevation has 322501 (95.3%) missing values Missing
minimumDepthInMeters has 264207 (78.1%) missing values Missing
maximumDepthInMeters has 271190 (80.1%) missing values Missing
verbatimDepth has 336961 (99.6%) missing values Missing
locationRemarks has 338438 (> 99.9%) missing values Missing
decimalLatitude has 73885 (21.8%) missing values Missing
decimalLongitude has 73885 (21.8%) missing values Missing
geodeticDatum has 308301 (91.1%) missing values Missing
coordinateUncertaintyInMeters has 327413 (96.7%) missing values Missing
coordinatePrecision has 338436 (> 99.9%) missing values Missing
verbatimCoordinates has 338436 (> 99.9%) missing values Missing
verbatimLatitude has 230082 (68.0%) missing values Missing
verbatimLongitude has 230109 (68.0%) missing values Missing
verbatimCoordinateSystem has 329369 (97.3%) missing values Missing
verbatimSRS has 338436 (> 99.9%) missing values Missing
footprintSpatialFit has 338436 (> 99.9%) missing values Missing
georeferencedBy has 338436 (> 99.9%) missing values Missing
georeferenceProtocol has 255527 (75.5%) missing values Missing
georeferenceRemarks has 328933 (97.2%) missing values Missing
geologicalContextID has 338439 (> 99.9%) missing values Missing
earliestEonOrLowestEonothem has 338436 (> 99.9%) missing values Missing
latestEonOrHighestEonothem has 338436 (> 99.9%) missing values Missing
earliestEraOrLowestErathem has 338438 (> 99.9%) missing values Missing
latestEraOrHighestErathem has 338436 (> 99.9%) missing values Missing
earliestPeriodOrLowestSystem has 338436 (> 99.9%) missing values Missing
earliestEpochOrLowestSeries has 338436 (> 99.9%) missing values Missing
lowestBiostratigraphicZone has 338436 (> 99.9%) missing values Missing
formation has 338436 (> 99.9%) missing values Missing
identificationQualifier has 333367 (98.5%) missing values Missing
typeStatus has 331835 (98.0%) missing values Missing
identifiedBy has 226287 (66.9%) missing values Missing
scientificName has 24062 (7.1%) missing values Missing
higherClassification has 5901 (1.7%) missing values Missing
kingdom has 10613 (3.1%) missing values Missing
phylum has 36740 (10.9%) missing values Missing
class has 12521 (3.7%) missing values Missing
order has 30431 (9.0%) missing values Missing
family has 18609 (5.5%) missing values Missing
genus has 25827 (7.6%) missing values Missing
subgenus has 336132 (99.3%) missing values Missing
specificEpithet has 33273 (9.8%) missing values Missing
infraspecificEpithet has 326664 (96.5%) missing values Missing
taxonRank has 326679 (96.5%) missing values Missing
scientificNameAuthorship has 174042 (51.4%) missing values Missing
gbifID has unique values Unique
occurrenceID has unique values Unique
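
The three alert types above (Constant, Missing, Unique) can be illustrated with a small sketch. This is not the profiler's own implementation: the function names `format_pct` and `column_alerts` are hypothetical, missing values are represented as `None`, and the rule that any missing value raises a Missing alert is an assumption. The percentage formatting mimics the "> 99.9%" and "< 0.1%" notation used in this report.

```python
def format_pct(count, total):
    """Format a share the way this report does (assumed rounding rules)."""
    pct = 100 * count / total
    if 0 < pct < 0.1:
        return "< 0.1%"
    if 99.9 < pct < 100:
        return "> 99.9%"
    return f"{pct:.1f}%"


def column_alerts(name, values):
    """Return alert strings for one column; None marks a missing cell."""
    n = len(values)
    present = [v for v in values if v is not None]
    distinct = set(present)
    n_missing = n - len(present)
    alerts = []
    # Constant: only one distinct non-missing value.
    if len(distinct) == 1:
        alerts.append(f'{name} has constant value "{next(iter(distinct))}" Constant')
    # Missing: at least one missing cell (assumed threshold).
    if n_missing:
        alerts.append(
            f"{name} has {n_missing} ({format_pct(n_missing, n)}) missing values Missing"
        )
    # Unique: no missing cells and every value differs.
    if n > 1 and n_missing == 0 and len(distinct) == n:
        alerts.append(f"{name} has unique values Unique")
    return alerts


print(format_pct(70749, 338440))    # catalogNumber
print(format_pct(338436, 338440))   # organismID
print(column_alerts("eventTime", [None, None, None, "94648"]))
```

A column such as eventTime, with a single non-missing value, triggers both a Constant and a Missing alert, which is why several variables appear twice in the list.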

Reproduction

Analysis started: 2025-01-14 16:34:21.771505
Analysis finished: 2025-01-14 16:34:33.689585
Duration: 11.92 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
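
The reported duration follows directly from the two timestamps. A minimal check using only the Python standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
started = datetime.fromisoformat("2025-01-14 16:34:21.771505")
finished = datetime.fromisoformat("2025-01-14 16:34:33.689585")

duration_seconds = (finished - started).total_seconds()
print(round(duration_seconds, 2))
```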

Variables

gbifID
Text

Unique 

Distinct: 338440
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 3384400
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 338440
Unique (%): 100.0%

Sample

1st row: 4501677301
2nd row: 3027962301
3rd row: 3028050301
4th row: 3027962302
5th row: 3028050302

Value    Count    Frequency (%)
4501677301    1    < 0.1%
3028050302    1    < 0.1%
3041539301    1    < 0.1%
3357130301    1    < 0.1%
3027962303    1    < 0.1%
3758404301    1    < 0.1%
3027962304    1    < 0.1%
3336913301    1    < 0.1%
3028050303    1    < 0.1%
4909491307    1    < 0.1%
Other values (338430)    338430    > 99.9%

Most occurring characters

ValueCountFrequency (%)
0 541548
16.0%
3 466717
13.8%
9 357521
10.6%
2 356979
10.5%
8 331351
9.8%
4 317675
9.4%
1 298477
8.8%
5 263706
7.8%
7 254468
7.5%
6 195958
 
5.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3384400
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 541548
16.0%
3 466717
13.8%
9 357521
10.6%
2 356979
10.5%
8 331351
9.8%
4 317675
9.4%
1 298477
8.8%
5 263706
7.8%
7 254468
7.5%
6 195958
 
5.8%

Most occurring scripts

ValueCountFrequency (%)
Common 3384400
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 541548
16.0%
3 466717
13.8%
9 357521
10.6%
2 356979
10.5%
8 331351
9.8%
4 317675
9.4%
1 298477
8.8%
5 263706
7.8%
7 254468
7.5%
6 195958
 
5.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3384400
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 541548
16.0%
3 466717
13.8%
9 357521
10.6%
2 356979
10.5%
8 331351
9.8%
4 317675
9.4%
1 298477
8.8%
5 263706
7.8%
7 254468
7.5%
6 195958
 
5.8%

Distinct: 10795
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 6430360
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2109
Unique (%): 0.6%

Sample

1st row: 2024-06-26 12:37:00
2nd row: 2021-10-14 09:12:00
3rd row: 2022-07-20 16:25:00
4th row: 2021-10-13 15:49:00
5th row: 2019-06-25 16:21:00

Value    Count    Frequency (%)
2021-05-07    24979    3.7%
2024-09-05    21488    3.2%
2022-10-06    14730    2.2%
2021-10-14    13097    1.9%
2021-10-13    12997    1.9%
2024-01-01    11156    1.6%
2024-10-17    10237    1.5%
2017-12-07    9787    1.4%
2023-12-17    9667    1.4%
2022-07-20    9524    1.4%
Other values (1817)    539218    79.7%

Most occurring characters

ValueCountFrequency (%)
0 1618219
25.2%
2 1049110
16.3%
1 803470
12.5%
- 676880
10.5%
: 676880
10.5%
338440
 
5.3%
3 230325
 
3.6%
4 213783
 
3.3%
5 202489
 
3.1%
7 196745
 
3.1%
Other values (3) 424019
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4738160
73.7%
Dash Punctuation 676880
 
10.5%
Other Punctuation 676880
 
10.5%
Space Separator 338440
 
5.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1618219
34.2%
2 1049110
22.1%
1 803470
17.0%
3 230325
 
4.9%
4 213783
 
4.5%
5 202489
 
4.3%
7 196745
 
4.2%
6 170871
 
3.6%
9 151998
 
3.2%
8 101150
 
2.1%
Dash Punctuation
ValueCountFrequency (%)
- 676880
100.0%
Other Punctuation
ValueCountFrequency (%)
: 676880
100.0%
Space Separator
ValueCountFrequency (%)
338440
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6430360
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1618219
25.2%
2 1049110
16.3%
1 803470
12.5%
- 676880
10.5%
: 676880
10.5%
338440
 
5.3%
3 230325
 
3.6%
4 213783
 
3.3%
5 202489
 
3.1%
7 196745
 
3.1%
Other values (3) 424019
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6430360
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1618219
25.2%
2 1049110
16.3%
1 803470
12.5%
- 676880
10.5%
: 676880
10.5%
338440
 
5.3%
3 230325
 
3.6%
4 213783
 
3.3%
5 202489
 
3.1%
7 196745
 
3.1%
Other values (3) 424019
 
6.6%

institutionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 31
Median length: 31
Mean length: 31
Min length: 31

Characters and Unicode

Total characters: 10491640
Distinct characters: 20
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
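
As an illustration of these character properties, the sketch below uses Python's standard `unicodedata` module to count the general categories in the constant institutionID value and scales them by the number of rows. It is only a cross-check of the reported totals, not the profiler's code. Python's `unicodedata` exposes categories but not scripts or blocks, so those are left out.

```python
import unicodedata
from collections import Counter

value = "http://grbio.org/cool/142r-0w94"  # the constant institutionID value
rows = 338_440                             # every row holds this value

# General category per character: Ll = Lowercase Letter, Nd = Decimal Number,
# Po = Other Punctuation, Pd = Dash Punctuation.
per_value = Counter(unicodedata.category(ch) for ch in value)
totals = {category: n * rows for category, n in per_value.items()}

print(len(value) * rows)   # total characters
print(len(set(value)))     # distinct characters
print(totals)              # characters per category across all rows
```

The totals match the "Most occurring categories" table for this variable.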

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: http://grbio.org/cool/142r-0w94
2nd row: http://grbio.org/cool/142r-0w94
3rd row: http://grbio.org/cool/142r-0w94
4th row: http://grbio.org/cool/142r-0w94
5th row: http://grbio.org/cool/142r-0w94

Value    Count    Frequency (%)
http://grbio.org/cool/142r-0w94    338440    100.0%

Most occurring characters

ValueCountFrequency (%)
/ 1353760
 
12.9%
o 1353760
 
12.9%
r 1015320
 
9.7%
g 676880
 
6.5%
t 676880
 
6.5%
4 676880
 
6.5%
h 338440
 
3.2%
1 338440
 
3.2%
w 338440
 
3.2%
0 338440
 
3.2%
Other values (10) 3384400
32.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6091920
58.1%
Other Punctuation 2030640
 
19.4%
Decimal Number 2030640
 
19.4%
Dash Punctuation 338440
 
3.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 1353760
22.2%
r 1015320
16.7%
g 676880
11.1%
t 676880
11.1%
h 338440
 
5.6%
w 338440
 
5.6%
l 338440
 
5.6%
c 338440
 
5.6%
i 338440
 
5.6%
b 338440
 
5.6%
Decimal Number
ValueCountFrequency (%)
4 676880
33.3%
1 338440
16.7%
0 338440
16.7%
2 338440
16.7%
9 338440
16.7%
Other Punctuation
ValueCountFrequency (%)
/ 1353760
66.7%
. 338440
 
16.7%
: 338440
 
16.7%
Dash Punctuation
ValueCountFrequency (%)
- 338440
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6091920
58.1%
Common 4399720
41.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 1353760
22.2%
r 1015320
16.7%
g 676880
11.1%
t 676880
11.1%
h 338440
 
5.6%
w 338440
 
5.6%
l 338440
 
5.6%
c 338440
 
5.6%
i 338440
 
5.6%
b 338440
 
5.6%
Common
ValueCountFrequency (%)
/ 1353760
30.8%
4 676880
15.4%
1 338440
 
7.7%
0 338440
 
7.7%
- 338440
 
7.7%
2 338440
 
7.7%
. 338440
 
7.7%
: 338440
 
7.7%
9 338440
 
7.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10491640
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 1353760
 
12.9%
o 1353760
 
12.9%
r 1015320
 
9.7%
g 676880
 
6.5%
t 676880
 
6.5%
4 676880
 
6.5%
h 338440
 
3.2%
1 338440
 
3.2%
w 338440
 
3.2%
0 338440
 
3.2%
Other values (10) 3384400
32.3%

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 45
Median length: 45
Mean length: 45
Min length: 45

Characters and Unicode

Total characters: 15229800
Distinct characters: 22
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:uuid:18e3cd08-a962-4f0a-b72c-9a0b3600c5ad
2nd row: urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6
3rd row: urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6
4th row: urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6
5th row: urn:uuid:60e28f81-e634-4869-aa3e-732caed713c8

Value    Count    Frequency (%)
urn:uuid:18e3cd08-a962-4f0a-b72c-9a0b3600c5ad    119162    35.2%
urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6    74430    22.0%
urn:uuid:60e28f81-e634-4869-aa3e-732caed713c8    42294    12.5%
urn:uuid:09c9cf5f-f5d3-48cc-b5c8-cd9b9fbd631f    41606    12.3%
urn:uuid:cc104cbf-fd8e-4801-9b71-36731a7db1a0    28278    8.4%
urn:uuid:59e56a59-8615-4e0c-841d-eb88f3876b22    24507    7.2%
urn:uuid:73d83e23-1999-42cd-b38a-c06a7d32d893    8163    2.4%

Most occurring characters

ValueCountFrequency (%)
- 1353760
 
8.9%
d 1155311
 
7.6%
c 1040667
 
6.8%
u 1015320
 
6.7%
8 917582
 
6.0%
0 797214
 
5.2%
a 775349
 
5.1%
1 741643
 
4.9%
9 705846
 
4.6%
: 676880
 
4.4%
Other values (12) 6050228
39.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6802549
44.7%
Decimal Number 6396611
42.0%
Dash Punctuation 1353760
 
8.9%
Other Punctuation 676880
 
4.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
d 1155311
17.0%
c 1040667
15.3%
u 1015320
14.9%
a 775349
11.4%
f 673839
9.9%
b 654013
9.6%
e 472730
6.9%
i 338440
 
5.0%
r 338440
 
5.0%
n 338440
 
5.0%
Decimal Number
ValueCountFrequency (%)
8 917582
14.3%
0 797214
12.5%
1 741643
11.6%
9 705846
11.0%
3 620753
9.7%
2 619705
9.7%
6 591204
9.2%
4 582379
9.1%
7 478277
7.5%
5 342008
 
5.3%
Dash Punctuation
ValueCountFrequency (%)
- 1353760
100.0%
Other Punctuation
ValueCountFrequency (%)
: 676880
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8427251
55.3%
Latin 6802549
44.7%

Most frequent character per script

Common
ValueCountFrequency (%)
- 1353760
16.1%
8 917582
10.9%
0 797214
9.5%
1 741643
8.8%
9 705846
8.4%
: 676880
8.0%
3 620753
7.4%
2 619705
7.4%
6 591204
7.0%
4 582379
6.9%
Other values (2) 820285
9.7%
Latin
ValueCountFrequency (%)
d 1155311
17.0%
c 1040667
15.3%
u 1015320
14.9%
a 775349
11.4%
f 673839
9.9%
b 654013
9.6%
e 472730
6.9%
i 338440
 
5.0%
r 338440
 
5.0%
n 338440
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15229800
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 1353760
 
8.9%
d 1155311
 
7.6%
c 1040667
 
6.8%
u 1015320
 
6.7%
8 917582
 
6.0%
0 797214
 
5.2%
a 775349
 
5.1%
1 741643
 
4.9%
9 705846
 
4.6%
: 676880
 
4.4%
Other values (12) 6050228
39.7%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 4
Median length: 4
Mean length: 3.750065004
Min length: 2

Characters and Unicode

Total characters: 1269172
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: USNM
2nd row: USNM
3rd row: USNM
4th row: USNM
5th row: US

Value    Count    Frequency (%)
usnm    296146    87.5%
us    42294    12.5%
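
The length statistics for this variable can be re-derived from the two value counts in the table above. The sketch assumes the original casing shown in the Sample rows (USNM, US) and that mean length is total characters divided by the number of non-missing rows.

```python
# Value counts copied from the frequency table (original casing from "Sample").
counts = {"USNM": 296_146, "US": 42_294}

total_rows = sum(counts.values())
total_chars = sum(len(value) * n for value, n in counts.items())
mean_length = total_chars / total_rows

print(total_rows)             # compare with the number of observations
print(total_chars)            # compare with "Total characters"
print(round(mean_length, 9))  # compare with "Mean length"
print(max(map(len, counts)), min(map(len, counts)))  # max and min length
```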

Most occurring characters

ValueCountFrequency (%)
U 338440
26.7%
S 338440
26.7%
N 296146
23.3%
M 296146
23.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1269172
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 338440
26.7%
S 338440
26.7%
N 296146
23.3%
M 296146
23.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 1269172
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 338440
26.7%
S 338440
26.7%
N 296146
23.3%
M 296146
23.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1269172
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 338440
26.7%
S 338440
26.7%
N 296146
23.3%
M 296146
23.3%

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 5
Median length: 4
Mean length: 2.982250916
Min length: 2

Characters and Unicode

Total characters: 1009313
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ENT
2nd row: IZ
3rd row: IZ
4th row: IZ
5th row: US

Value    Count    Frequency (%)
ent    119162    35.2%
iz    74430    22.0%
us    42294    12.5%
fish    41606    12.3%
herp    28278    8.4%
mamm    24507    7.2%
birds    8163    2.4%

Most occurring characters

ValueCountFrequency (%)
E 147440
14.6%
I 124199
12.3%
N 119162
11.8%
T 119162
11.8%
S 92063
9.1%
Z 74430
7.4%
M 73521
7.3%
H 69884
6.9%
U 42294
 
4.2%
F 41606
 
4.1%
Other values (5) 105552
10.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1009313
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 147440
14.6%
I 124199
12.3%
N 119162
11.8%
T 119162
11.8%
S 92063
9.1%
Z 74430
7.4%
M 73521
7.3%
H 69884
6.9%
U 42294
 
4.2%
F 41606
 
4.1%
Other values (5) 105552
10.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 1009313
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 147440
14.6%
I 124199
12.3%
N 119162
11.8%
T 119162
11.8%
S 92063
9.1%
Z 74430
7.4%
M 73521
7.3%
H 69884
6.9%
U 42294
 
4.2%
F 41606
 
4.1%
Other values (5) 105552
10.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1009313
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 147440
14.6%
I 124199
12.3%
N 119162
11.8%
T 119162
11.8%
S 92063
9.1%
Z 74430
7.4%
M 73521
7.3%
H 69884
6.9%
U 42294
 
4.2%
F 41606
 
4.1%
Other values (5) 105552
10.5%

datasetName
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 28
Median length: 28
Mean length: 28
Min length: 28

Characters and Unicode

Total characters: 9476320
Distinct characters: 17
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NMNH Material Samples (USNM)
2nd row: NMNH Material Samples (USNM)
3rd row: NMNH Material Samples (USNM)
4th row: NMNH Material Samples (USNM)
5th row: NMNH Material Samples (USNM)

Value    Count    Frequency (%)
nmnh    338440    25.0%
material    338440    25.0%
samples    338440    25.0%
usnm    338440    25.0%
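
The table above counts words rather than whole cell values, which is why a constant column shows four entries at 25.0% each. The sketch below reproduces that pattern under an assumed tokenization (lowercase, split on whitespace, strip surrounding punctuation). The profiler's exact rules are not stated in this report, and `word_counts` is a hypothetical helper.

```python
import string
from collections import Counter


def word_counts(values):
    """Count lowercase, punctuation-stripped words across all values."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            counts[token.strip(string.punctuation)] += 1
    return counts


# Four copies of the constant value stand in for the 338440 rows.
values = ["NMNH Material Samples (USNM)"] * 4
counts = word_counts(values)
total = sum(counts.values())
shares = {word: round(100 * n / total, 1) for word, n in counts.items()}

print(counts)
print(shares)
```

Because frequencies are shares of all words, the percentages in these tables need not match the share of rows containing a value.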

Most occurring characters

ValueCountFrequency (%)
N 1015320
10.7%
1015320
10.7%
a 1015320
10.7%
M 1015320
10.7%
e 676880
 
7.1%
l 676880
 
7.1%
S 676880
 
7.1%
p 338440
 
3.6%
U 338440
 
3.6%
( 338440
 
3.6%
Other values (7) 2369080
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4399720
46.4%
Uppercase Letter 3384400
35.7%
Space Separator 1015320
 
10.7%
Open Punctuation 338440
 
3.6%
Close Punctuation 338440
 
3.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1015320
23.1%
e 676880
15.4%
l 676880
15.4%
p 338440
 
7.7%
s 338440
 
7.7%
i 338440
 
7.7%
m 338440
 
7.7%
r 338440
 
7.7%
t 338440
 
7.7%
Uppercase Letter
ValueCountFrequency (%)
N 1015320
30.0%
M 1015320
30.0%
S 676880
20.0%
U 338440
 
10.0%
H 338440
 
10.0%
Space Separator
ValueCountFrequency (%)
1015320
100.0%
Open Punctuation
ValueCountFrequency (%)
( 338440
100.0%
Close Punctuation
ValueCountFrequency (%)
) 338440
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7784120
82.1%
Common 1692200
 
17.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 1015320
13.0%
a 1015320
13.0%
M 1015320
13.0%
e 676880
8.7%
l 676880
8.7%
S 676880
8.7%
p 338440
 
4.3%
U 338440
 
4.3%
s 338440
 
4.3%
i 338440
 
4.3%
Other values (4) 1353760
17.4%
Common
ValueCountFrequency (%)
1015320
60.0%
( 338440
 
20.0%
) 338440
 
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9476320
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 1015320
10.7%
1015320
10.7%
a 1015320
10.7%
M 1015320
10.7%
e 676880
 
7.1%
l 676880
 
7.1%
S 676880
 
7.1%
p 338440
 
3.6%
U 338440
 
3.6%
( 338440
 
3.6%
Other values (7) 2369080
25.0%

basisOfRecord
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 14
Median length: 14
Mean length: 14
Min length: 14

Characters and Unicode

Total characters: 4738160
Distinct characters: 10
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MaterialSample
2nd row: MaterialSample
3rd row: MaterialSample
4th row: MaterialSample
5th row: MaterialSample

Value    Count    Frequency (%)
materialsample    338440    100.0%

Most occurring characters

ValueCountFrequency (%)
a 1015320
21.4%
e 676880
14.3%
l 676880
14.3%
M 338440
 
7.1%
t 338440
 
7.1%
r 338440
 
7.1%
i 338440
 
7.1%
S 338440
 
7.1%
m 338440
 
7.1%
p 338440
 
7.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4061280
85.7%
Uppercase Letter 676880
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1015320
25.0%
e 676880
16.7%
l 676880
16.7%
t 338440
 
8.3%
r 338440
 
8.3%
i 338440
 
8.3%
m 338440
 
8.3%
p 338440
 
8.3%
Uppercase Letter
ValueCountFrequency (%)
M 338440
50.0%
S 338440
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4738160
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1015320
21.4%
e 676880
14.3%
l 676880
14.3%
M 338440
 
7.1%
t 338440
 
7.1%
r 338440
 
7.1%
i 338440
 
7.1%
S 338440
 
7.1%
m 338440
 
7.1%
p 338440
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4738160
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1015320
21.4%
e 676880
14.3%
l 676880
14.3%
M 338440
 
7.1%
t 338440
 
7.1%
r 338440
 
7.1%
i 338440
 
7.1%
S 338440
 
7.1%
m 338440
 
7.1%
p 338440
 
7.1%

occurrenceID
Text

Unique 

Distinct: 338440
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 63
Median length: 63
Mean length: 63
Min length: 63

Characters and Unicode

Total characters: 21321720
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 338440
Unique (%): 100.0%

Sample

1st row: http://n2t.net/ark:/65665/300028c5f-ea1d-4c01-9253-09524fc57db6
2nd row: http://n2t.net/ark:/65665/30006cd83-36b3-4629-86db-f5a28307189f
3rd row: http://n2t.net/ark:/65665/30007a443-7a0a-49a9-9c54-cae1342160a6
4th row: http://n2t.net/ark:/65665/300098b69-426b-451c-a675-27a1b7bb5b60
5th row: http://n2t.net/ark:/65665/3000a9424-501b-43e7-a337-ee632a8fa9d0

Value    Count    Frequency (%)
http://n2t.net/ark:/65665/300028c5f-ea1d-4c01-9253-09524fc57db6    1    < 0.1%
http://n2t.net/ark:/65665/3000a9424-501b-43e7-a337-ee632a8fa9d0    1    < 0.1%
http://n2t.net/ark:/65665/3000ff086-55d6-4f50-81a9-fc07e565e180    1    < 0.1%
http://n2t.net/ark:/65665/300114e18-4d31-4558-acc1-47ce8dd8940c    1    < 0.1%
http://n2t.net/ark:/65665/300119514-9afd-4342-83ae-3526ac40f20f    1    < 0.1%
http://n2t.net/ark:/65665/300154f73-1f7a-4d73-8c43-7c6d66c03b0f    1    < 0.1%
http://n2t.net/ark:/65665/30015c5b5-263e-4d28-916f-89728207dfda    1    < 0.1%
http://n2t.net/ark:/65665/3001878d3-3d26-4b66-9ad5-77d6938de137    1    < 0.1%
http://n2t.net/ark:/65665/300187c30-1f5e-4401-a208-4e42206dc341    1    < 0.1%
http://n2t.net/ark:/65665/300193d42-6a2a-41b9-b203-29e571953cd6    1    < 0.1%
Other values (338430)    338430    > 99.9%

Most occurring characters

ValueCountFrequency (%)
/ 1692200
 
7.9%
6 1649636
 
7.7%
- 1353760
 
6.3%
t 1353760
 
6.3%
5 1311144
 
6.1%
a 1057250
 
5.0%
4 973748
 
4.6%
3 973420
 
4.6%
2 972810
 
4.6%
e 971609
 
4.6%
Other values (16) 9012383
42.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 9223035
43.3%
Lowercase Letter 8037405
37.7%
Other Punctuation 2707520
 
12.7%
Dash Punctuation 1353760
 
6.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 1353760
16.8%
a 1057250
13.2%
e 971609
12.1%
b 718766
8.9%
n 676880
8.4%
f 635572
7.9%
c 635398
7.9%
d 634410
7.9%
k 338440
 
4.2%
r 338440
 
4.2%
Other values (2) 676880
8.4%
Decimal Number
ValueCountFrequency (%)
6 1649636
17.9%
5 1311144
14.2%
4 973748
10.6%
3 973420
10.6%
2 972810
10.5%
9 719712
7.8%
8 718307
7.8%
0 635068
 
6.9%
1 634600
 
6.9%
7 634590
 
6.9%
Other Punctuation
ValueCountFrequency (%)
/ 1692200
62.5%
: 676880
 
25.0%
. 338440
 
12.5%
Dash Punctuation
ValueCountFrequency (%)
- 1353760
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 13284315
62.3%
Latin 8037405
37.7%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 1692200
12.7%
6 1649636
12.4%
- 1353760
10.2%
5 1311144
9.9%
4 973748
7.3%
3 973420
7.3%
2 972810
7.3%
9 719712
 
5.4%
8 718307
 
5.4%
: 676880
 
5.1%
Other values (4) 2242698
16.9%
Latin
ValueCountFrequency (%)
t 1353760
16.8%
a 1057250
13.2%
e 971609
12.1%
b 718766
8.9%
n 676880
8.4%
f 635572
7.9%
c 635398
7.9%
d 634410
7.9%
k 338440
 
4.2%
r 338440
 
4.2%
Other values (2) 676880
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21321720
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 1692200
 
7.9%
6 1649636
 
7.7%
- 1353760
 
6.3%
t 1353760
 
6.3%
5 1311144
 
6.1%
a 1057250
 
5.0%
4 973748
 
4.6%
3 973420
 
4.6%
2 972810
 
4.6%
e 971609
 
4.6%
Other values (16) 9012383
42.3%

catalogNumber
Text

Missing 

Distinct: 226029
Distinct (%): 84.4%
Missing: 70749
Missing (%): 20.9%
Memory size: 2.6 MiB

Length

Max length: 21
Median length: 20
Mean length: 14.08593117
Min length: 9

Characters and Unicode

Total characters: 3770677
Distinct characters: 36
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 192923
Unique (%): 72.1%
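
"Distinct" and "Unique" measure different things: distinct counts the different values present, while unique counts values that occur exactly once. The sketch below shows the distinction on a toy column and then checks that the catalogNumber percentages are consistent with a denominator of non-missing rows. The helper `distinct_and_unique` is hypothetical, and the choice of denominator is inferred from the reported numbers rather than documented here.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique, non_missing) for a column; None is missing."""
    present = [v for v in values if v is not None]
    freq = Counter(present)
    distinct = len(freq)
    unique = sum(1 for n in freq.values() if n == 1)
    return distinct, unique, len(present)


# Toy column: "a" repeats, "b" and "c" occur once, one cell is missing.
print(distinct_and_unique(["a", "a", "b", None, "c"]))

# Figures copied from the catalogNumber section.
observations, missing = 338_440, 70_749
distinct, unique = 226_029, 192_923
non_missing = observations - missing

distinct_pct = round(100 * distinct / non_missing, 1)
unique_pct = round(100 * unique / non_missing, 1)
print(distinct_pct, unique_pct)
```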

Sample

1st row: USNMENT00976719.2
2nd row: USNM 1566725
3rd row: USNM 1430312
4th row: USNM 1477111
5th row: USNMENT01646520

Value    Count    Frequency (%)
usnm    146337    33.4%
herp    7481    1.7%
tissue    7190    1.6%
us    2194    0.5%
wet    2190    0.5%
lot    2190    0.5%
(blank)    2190    0.5%
image    291    0.1%
594492    64    < 0.1%
1487948    58    < 0.1%
Other values (223627)    267569    61.1%

Most occurring characters

ValueCountFrequency (%)
N 384657
 
10.2%
1 339055
 
9.0%
0 282387
 
7.5%
S 267692
 
7.1%
U 267691
 
7.1%
M 265497
 
7.0%
4 250940
 
6.7%
6 201528
 
5.3%
3 187578
 
5.0%
2 175087
 
4.6%
Other values (26) 1148565
30.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1991709
52.8%
Uppercase Letter 1438835
38.2%
Space Separator 170063
 
4.5%
Other Punctuation 95183
 
2.5%
Lowercase Letter 72697
 
1.9%
Dash Punctuation 2190
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 17152
23.6%
s 14380
19.8%
p 7481
10.3%
r 7481
10.3%
i 7190
9.9%
u 7190
9.9%
t 4380
 
6.0%
w 2190
 
3.0%
l 2190
 
3.0%
o 2190
 
3.0%
Other values (3) 873
 
1.2%
Uppercase Letter
ValueCountFrequency (%)
N 384657
26.7%
S 267692
18.6%
U 267691
18.6%
M 265497
18.5%
T 126351
 
8.8%
E 119160
 
8.3%
H 7481
 
0.5%
I 291
 
< 0.1%
A 14
 
< 0.1%
R 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 339055
17.0%
0 282387
14.2%
4 250940
12.6%
6 201528
10.1%
3 187578
9.4%
2 175087
8.8%
5 167513
8.4%
9 130924
 
6.6%
7 128542
 
6.5%
8 128155
 
6.4%
Space Separator
ValueCountFrequency (%)
170063
100.0%
Other Punctuation
ValueCountFrequency (%)
. 95183
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2190
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2259145
59.9%
Latin 1511532
40.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 384657
25.4%
S 267692
17.7%
U 267691
17.7%
M 265497
17.6%
T 126351
 
8.4%
E 119160
 
7.9%
e 17152
 
1.1%
s 14380
 
1.0%
p 7481
 
0.5%
r 7481
 
0.5%
Other values (13) 33990
 
2.2%
Common
ValueCountFrequency (%)
1 339055
15.0%
0 282387
12.5%
4 250940
11.1%
6 201528
8.9%
3 187578
8.3%
2 175087
7.8%
170063
7.5%
5 167513
7.4%
9 130924
 
5.8%
7 128542
 
5.7%
Other values (3) 225528
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3770677
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 384657
 
10.2%
1 339055
 
9.0%
0 282387
 
7.5%
S 267692
 
7.1%
U 267691
 
7.1%
M 265497
 
7.0%
4 250940
 
6.7%
6 201528
 
5.3%
3 187578
 
5.0%
2 175087
 
4.6%
Other values (26) 1148565
30.5%

recordNumber
Text

Missing 

Distinct: 103014
Distinct (%): 65.8%
Missing: 181774
Missing (%): 53.7%
Memory size: 2.6 MiB

Length

Max length: 87
Median length: 53
Mean length: 8.258850038
Min length: 1

Characters and Unicode

Total characters: 1293881
Distinct characters: 76
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 67354
Unique (%): 43.0%

Sample

1st row: T548-A9-TW19
2nd row: BMOO-09792
3rd row: JC3629
4th row: 707
5th row: mbio988

Value    Count    Frequency (%)
blz    5369    2.8%
d&ml    4442    2.4%
(blank)    1572    0.8%
tag    1342    0.7%
tree    1342    0.7%
flmoo    1323    0.7%
blb    1220    0.6%
sms    1216    0.6%
bah    991    0.5%
tob    838    0.4%
Other values (93558)    168768    89.6%

Most occurring characters

ValueCountFrequency (%)
1 122839
 
9.5%
2 92741
 
7.2%
0 89293
 
6.9%
3 72397
 
5.6%
- 60896
 
4.7%
5 57984
 
4.5%
4 57632
 
4.5%
6 53757
 
4.2%
8 52829
 
4.1%
7 52283
 
4.0%
Other values (66) 581230
44.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 701841
54.2%
Uppercase Letter 422285
32.6%
Dash Punctuation 60911
 
4.7%
Lowercase Letter 41313
 
3.2%
Space Separator 31757
 
2.5%
Connector Punctuation 19948
 
1.5%
Other Punctuation 11784
 
0.9%
Close Punctuation 2021
 
0.2%
Open Punctuation 2021
 
0.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 37587
 
8.9%
B 37015
 
8.8%
O 31746
 
7.5%
M 31446
 
7.4%
S 27844
 
6.6%
A 26271
 
6.2%
R 24686
 
5.8%
T 22639
 
5.4%
L 20614
 
4.9%
E 18908
 
4.5%
Other values (16) 143529
34.0%
Lowercase Letter
ValueCountFrequency (%)
e 5200
12.6%
i 4273
10.3%
a 4129
10.0%
b 4010
9.7%
o 4008
9.7%
r 3394
8.2%
m 3338
8.1%
l 2849
6.9%
s 1665
 
4.0%
v 1562
 
3.8%
Other values (15) 6885
16.7%
Decimal Number
ValueCountFrequency (%)
1 122839
17.5%
2 92741
13.2%
0 89293
12.7%
3 72397
10.3%
5 57984
8.3%
4 57632
8.2%
6 53757
7.7%
8 52829
7.5%
7 52283
7.4%
9 50086
7.1%
Other Punctuation
ValueCountFrequency (%)
, 4691
39.8%
& 4584
38.9%
# 1514
 
12.8%
. 921
 
7.8%
/ 49
 
0.4%
? 22
 
0.2%
: 3
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 60896
> 99.9%
15
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 2007
99.3%
] 14
 
0.7%
Open Punctuation
ValueCountFrequency (%)
( 2007
99.3%
[ 14
 
0.7%
Space Separator
ValueCountFrequency (%)
31757
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 19948
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 830283
64.2%
Latin 463598
35.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 37587
 
8.1%
B 37015
 
8.0%
O 31746
 
6.8%
M 31446
 
6.8%
S 27844
 
6.0%
A 26271
 
5.7%
R 24686
 
5.3%
T 22639
 
4.9%
L 20614
 
4.4%
E 18908
 
4.1%
Other values (41) 184842
39.9%
Common
ValueCountFrequency (%)
1 122839
14.8%
2 92741
11.2%
0 89293
10.8%
3 72397
8.7%
- 60896
7.3%
5 57984
7.0%
4 57632
6.9%
6 53757
6.5%
8 52829
6.4%
7 52283
6.3%
Other values (15) 117632
14.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1293866
> 99.9%
Punctuation 15
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 122839
 
9.5%
2 92741
 
7.2%
0 89293
 
6.9%
3 72397
 
5.6%
- 60896
 
4.7%
5 57984
 
4.5%
4 57632
 
4.5%
6 53757
 
4.2%
8 52829
 
4.1%
7 52283
 
4.0%
Other values (65) 581215
44.9%
Punctuation
ValueCountFrequency (%)
15
100.0%

recordedBy
Text

Missing 

Distinct 8091
Distinct (%) 3.0%
Missing 70194
Missing (%) 20.7%
Memory size 2.6 MiB

Length

Max length 161
Median length 107
Mean length 24.1533555
Min length 1

Characters and Unicode

Total characters 6479041
Distinct characters 83
Distinct categories 8
Distinct scripts 2
Distinct blocks 2

Unique

Unique 911
Unique (%) 0.3%

Sample

1st row R. Wielgus
2nd row R. Vrijenhoek
3rd row S. McPherson
4th row K. Crandall, H. Robinson, J. Buhay & A. Toon
5th row Tibet-MacArthur, D. A. Bell, V. A. Funk, S. Ge, Y. Meng, Z. Nie, R. Ree, J. Wen, J. Yue & W. Zuo
ValueCountFrequency (%)
115581
 
8.9%
m 71033
 
5.5%
j 69003
 
5.3%
r 47243
 
3.6%
d 44057
 
3.4%
c 43631
 
3.4%
s 40837
 
3.1%
k 35450
 
2.7%
l 29158
 
2.2%
a 28418
 
2.2%
Other values (5514) 776756
59.7%

Most occurring characters

ValueCountFrequency (%)
1032921
15.9%
. 565462
 
8.7%
e 432466
 
6.7%
a 360216
 
5.6%
n 295808
 
4.6%
r 285836
 
4.4%
i 278879
 
4.3%
l 261365
 
4.0%
o 259195
 
4.0%
t 196007
 
3.0%
Other values (73) 2510886
38.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3378651
52.1%
Uppercase Letter 1203503
 
18.6%
Space Separator 1032921
 
15.9%
Other Punctuation 838894
 
12.9%
Dash Punctuation 13931
 
0.2%
Decimal Number 8809
 
0.1%
Close Punctuation 1221
 
< 0.1%
Open Punctuation 1111
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 432466
12.8%
a 360216
10.7%
n 295808
8.8%
r 285836
8.5%
i 278879
 
8.3%
l 261365
 
7.7%
o 259195
 
7.7%
t 196007
 
5.8%
s 189168
 
5.6%
u 120911
 
3.6%
Other values (27) 698800
20.7%
Uppercase Letter
ValueCountFrequency (%)
M 124673
 
10.4%
S 91501
 
7.6%
C 84918
 
7.1%
B 82920
 
6.9%
R 80213
 
6.7%
J 77130
 
6.4%
P 76391
 
6.3%
D 68133
 
5.7%
L 65369
 
5.4%
W 57293
 
4.8%
Other values (17) 394962
32.8%
Decimal Number
ValueCountFrequency (%)
9 2232
25.3%
1 2078
23.6%
2 2016
22.9%
0 1932
21.9%
8 370
 
4.2%
6 95
 
1.1%
4 84
 
1.0%
3 2
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
. 565462
67.4%
, 155134
 
18.5%
& 115577
 
13.8%
/ 2047
 
0.2%
' 674
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 1011
82.8%
] 210
 
17.2%
Open Punctuation
ValueCountFrequency (%)
( 901
81.1%
[ 210
 
18.9%
Space Separator
ValueCountFrequency (%)
1032921
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 13931
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4582154
70.7%
Common 1896887
29.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 432466
 
9.4%
a 360216
 
7.9%
n 295808
 
6.5%
r 285836
 
6.2%
i 278879
 
6.1%
l 261365
 
5.7%
o 259195
 
5.7%
t 196007
 
4.3%
s 189168
 
4.1%
M 124673
 
2.7%
Other values (54) 1898541
41.4%
Common
ValueCountFrequency (%)
1032921
54.5%
. 565462
29.8%
, 155134
 
8.2%
& 115577
 
6.1%
- 13931
 
0.7%
9 2232
 
0.1%
1 2078
 
0.1%
/ 2047
 
0.1%
2 2016
 
0.1%
0 1932
 
0.1%
Other values (9) 3557
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6477050
> 99.9%
None 1991
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1032921
15.9%
. 565462
 
8.7%
e 432466
 
6.7%
a 360216
 
5.6%
n 295808
 
4.6%
r 285836
 
4.4%
i 278879
 
4.3%
l 261365
 
4.0%
o 259195
 
4.0%
t 196007
 
3.0%
Other values (61) 2508895
38.7%
None
ValueCountFrequency (%)
í 1006
50.5%
é 487
24.5%
ö 157
 
7.9%
á 138
 
6.9%
ó 97
 
4.9%
Ç 33
 
1.7%
ı 33
 
1.7%
ñ 21
 
1.1%
ú 12
 
0.6%
ü 3
 
0.2%
Other values (2) 4
 
0.2%

individualCount
Text

Missing 

Distinct 19
Distinct (%) < 0.1%
Missing 39392
Missing (%) 11.6%
Memory size 2.6 MiB

Length

Max length 2
Median length 1
Mean length 1.00012707
Min length 1

Characters and Unicode

Total characters 299086
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 6
Unique (%) < 0.1%

Sample

1st row 1
2nd row 1
3rd row 1
4th row 1
5th row 1
ValueCountFrequency (%)
1 295008
98.6%
0 2661
 
0.9%
4 440
 
0.1%
2 364
 
0.1%
5 280
 
0.1%
3 226
 
0.1%
10 26
 
< 0.1%
6 20
 
< 0.1%
8 5
 
< 0.1%
7 4
 
< 0.1%
Other values (9) 14
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
1 295042
98.6%
0 2691
 
0.9%
4 442
 
0.1%
2 369
 
0.1%
5 280
 
0.1%
3 229
 
0.1%
6 21
 
< 0.1%
8 5
 
< 0.1%
7 4
 
< 0.1%
9 3
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 299086
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 295042
98.6%
0 2691
 
0.9%
4 442
 
0.1%
2 369
 
0.1%
5 280
 
0.1%
3 229
 
0.1%
6 21
 
< 0.1%
8 5
 
< 0.1%
7 4
 
< 0.1%
9 3
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 299086
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 295042
98.6%
0 2691
 
0.9%
4 442
 
0.1%
2 369
 
0.1%
5 280
 
0.1%
3 229
 
0.1%
6 21
 
< 0.1%
8 5
 
< 0.1%
7 4
 
< 0.1%
9 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 299086
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 295042
98.6%
0 2691
 
0.9%
4 442
 
0.1%
2 369
 
0.1%
5 280
 
0.1%
3 229
 
0.1%
6 21
 
< 0.1%
8 5
 
< 0.1%
7 4
 
< 0.1%
9 3
 
< 0.1%

sex
Text

Missing 

Distinct 58
Distinct (%) < 0.1%
Missing 176515
Missing (%) 52.2%
Memory size 2.6 MiB

Length

Max length 40
Median length 7
Mean length 6.105907056
Min length 4

Characters and Unicode

Total characters 988699
Distinct characters 28
Distinct categories 4
Distinct scripts 2
Distinct blocks 1

Unique

Unique 12
Unique (%) < 0.1%

Sample

1st row Unknown
2nd row Unknown
3rd row Unknown
4th row Male
5th row Unknown
ValueCountFrequency (%)
unknown 88174
53.9%
male 41609
25.4%
female 32812
 
20.1%
worker 489
 
0.3%
sex 178
 
0.1%
hermaphrodite 73
 
< 0.1%
62
 
< 0.1%
unable 24
 
< 0.1%
to 24
 
< 0.1%
determine 24
 
< 0.1%
Other values (6) 57
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
n 264592
26.8%
e 108181
10.9%
o 88771
 
9.0%
k 88663
 
9.0%
w 88663
 
9.0%
U 87723
 
8.9%
a 74566
 
7.5%
l 74493
 
7.5%
m 37336
 
3.8%
M 37219
 
3.8%
Other values (18) 38492
 
3.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 831664
84.1%
Uppercase Letter 154048
 
15.6%
Space Separator 1601
 
0.2%
Other Punctuation 1386
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 264592
31.8%
e 108181
13.0%
o 88771
 
10.7%
k 88663
 
10.7%
w 88663
 
10.7%
a 74566
 
9.0%
l 74493
 
9.0%
m 37336
 
4.5%
f 3897
 
0.5%
r 1159
 
0.1%
Other values (9) 1343
 
0.2%
Uppercase Letter
ValueCountFrequency (%)
U 87723
56.9%
M 37219
24.2%
F 28928
 
18.8%
S 167
 
0.1%
P 11
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
; 1380
99.6%
? 4
 
0.3%
/ 2
 
0.1%
Space Separator
ValueCountFrequency (%)
1601
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 985712
99.7%
Common 2987
 
0.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 264592
26.8%
e 108181
11.0%
o 88771
 
9.0%
k 88663
 
9.0%
w 88663
 
9.0%
U 87723
 
8.9%
a 74566
 
7.6%
l 74493
 
7.6%
m 37336
 
3.8%
M 37219
 
3.8%
Other values (14) 35505
 
3.6%
Common
ValueCountFrequency (%)
1601
53.6%
; 1380
46.2%
? 4
 
0.1%
/ 2
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 988699
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 264592
26.8%
e 108181
10.9%
o 88771
 
9.0%
k 88663
 
9.0%
w 88663
 
9.0%
U 87723
 
8.9%
a 74566
 
7.5%
l 74493
 
7.5%
m 37336
 
3.8%
M 37219
 
3.8%
Other values (18) 38492
 
3.9%

lifeStage
Text

Missing 

Distinct 170
Distinct (%) 0.1%
Missing 205051
Missing (%) 60.6%
Memory size 2.6 MiB

Length

Max length 37
Median length 5
Mean length 5.180614593
Min length 1

Characters and Unicode

Total characters 691037
Distinct characters 52
Distinct categories 7
Distinct scripts 2
Distinct blocks 2

Unique

Unique 33
Unique (%) < 0.1%

Sample

1st row Adult
2nd row Adult
3rd row Adult
4th row Adult
5th row Adult
ValueCountFrequency (%)
adult 122000
90.4%
juvenile 3367
 
2.5%
larva 1575
 
1.2%
ii 1499
 
1.1%
flowering 1064
 
0.8%
i 883
 
0.7%
unknown 548
 
0.4%
subadult 538
 
0.4%
sterile 365
 
0.3%
eft 308
 
0.2%
Other values (90) 2764
 
2.0%

Most occurring characters

ValueCountFrequency (%)
l 129264
18.7%
u 127768
18.5%
t 124368
18.0%
d 122898
17.8%
A 121282
17.6%
e 10362
 
1.5%
n 6940
 
1.0%
a 6310
 
0.9%
i 5972
 
0.9%
v 5446
 
0.8%
Other values (42) 30427
 
4.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 556509
80.5%
Uppercase Letter 131353
 
19.0%
Other Punctuation 1566
 
0.2%
Space Separator 1522
 
0.2%
Dash Punctuation 65
 
< 0.1%
Open Punctuation 11
 
< 0.1%
Close Punctuation 11
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 129264
23.2%
u 127768
23.0%
t 124368
22.3%
d 122898
22.1%
e 10362
 
1.9%
n 6940
 
1.2%
a 6310
 
1.1%
i 5972
 
1.1%
v 5446
 
1.0%
r 4535
 
0.8%
Other values (15) 12646
 
2.3%
Uppercase Letter
ValueCountFrequency (%)
A 121282
92.3%
I 4215
 
3.2%
J 1708
 
1.3%
F 1155
 
0.9%
S 895
 
0.7%
U 548
 
0.4%
L 513
 
0.4%
E 357
 
0.3%
P 245
 
0.2%
N 128
 
0.1%
Other values (9) 307
 
0.2%
Other Punctuation
ValueCountFrequency (%)
; 1498
95.7%
? 37
 
2.4%
' 28
 
1.8%
/ 3
 
0.2%
Space Separator
ValueCountFrequency (%)
1522
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 65
100.0%
Open Punctuation
ValueCountFrequency (%)
( 11
100.0%
Close Punctuation
ValueCountFrequency (%)
) 11
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 687862
99.5%
Common 3175
 
0.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 129264
18.8%
u 127768
18.6%
t 124368
18.1%
d 122898
17.9%
A 121282
17.6%
e 10362
 
1.5%
n 6940
 
1.0%
a 6310
 
0.9%
i 5972
 
0.9%
v 5446
 
0.8%
Other values (34) 27252
 
4.0%
Common
ValueCountFrequency (%)
1522
47.9%
; 1498
47.2%
- 65
 
2.0%
? 37
 
1.2%
' 28
 
0.9%
( 11
 
0.3%
) 11
 
0.3%
/ 3
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 691009
> 99.9%
None 28
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 129264
18.7%
u 127768
18.5%
t 124368
18.0%
d 122898
17.8%
A 121282
17.6%
e 10362
 
1.5%
n 6940
 
1.0%
a 6310
 
0.9%
i 5972
 
0.9%
v 5446
 
0.8%
Other values (41) 30399
 
4.4%
None
ValueCountFrequency (%)
ü 28
100.0%

occurrenceStatus
Text

Constant 

Distinct 1
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 2.6 MiB

Length

Max length 7
Median length 7
Mean length 7
Min length 7

Characters and Unicode

Total characters 2369080
Distinct characters 6
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row present
2nd row present
3rd row present
4th row present
5th row present
ValueCountFrequency (%)
present 338440
100.0%

Most occurring characters

ValueCountFrequency (%)
e 676880
28.6%
p 338440
14.3%
r 338440
14.3%
s 338440
14.3%
n 338440
14.3%
t 338440
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2369080
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 676880
28.6%
p 338440
14.3%
r 338440
14.3%
s 338440
14.3%
n 338440
14.3%
t 338440
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 2369080
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 676880
28.6%
p 338440
14.3%
r 338440
14.3%
s 338440
14.3%
n 338440
14.3%
t 338440
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2369080
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 676880
28.6%
p 338440
14.3%
r 338440
14.3%
s 338440
14.3%
n 338440
14.3%
t 338440
14.3%

preparations
Text

Missing 

Distinct 34
Distinct (%) < 0.1%
Missing 251349
Missing (%) 74.3%
Memory size 2.6 MiB

Length

Max length 142
Median length 6
Mean length 6.192109403
Min length 4

Characters and Unicode

Total characters 539277
Distinct characters 47
Distinct categories 8
Distinct scripts 2
Distinct blocks 1

Unique

Unique 5
Unique (%) < 0.1%

Sample

1st row Frozen
2nd row Frozen
3rd row Frozen
4th row Frozen
5th row Frozen
ValueCountFrequency (%)
frozen 72657
79.3%
vial 6702
 
7.3%
ethanol 4922
 
5.4%
wet 2271
 
2.5%
lot 2271
 
2.5%
drained 1063
 
1.2%
photograph 626
 
0.7%
biorepository 456
 
0.5%
alcohol 198
 
0.2%
148
 
0.2%
Other values (11) 295
 
0.3%

Most occurring characters

ValueCountFrequency (%)
o 83011
15.4%
n 78739
14.6%
e 76503
14.2%
r 75316
14.0%
z 72657
13.5%
F 72306
13.4%
l 14346
 
2.7%
a 13325
 
2.5%
t 10601
 
2.0%
i 8777
 
1.6%
Other values (37) 33696
6.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 448059
83.1%
Uppercase Letter 84989
 
15.8%
Space Separator 4518
 
0.8%
Other Punctuation 837
 
0.2%
Decimal Number 296
 
0.1%
Open Punctuation 198
 
< 0.1%
Close Punctuation 198
 
< 0.1%
Dash Punctuation 182
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 83011
18.5%
n 78739
17.6%
e 76503
17.1%
r 75316
16.8%
z 72657
16.2%
l 14346
 
3.2%
a 13325
 
3.0%
t 10601
 
2.4%
i 8777
 
2.0%
h 6372
 
1.4%
Other values (13) 8412
 
1.9%
Uppercase Letter
ValueCountFrequency (%)
F 72306
85.1%
E 4957
 
5.8%
V 3865
 
4.5%
W 2256
 
2.7%
P 626
 
0.7%
B 456
 
0.5%
A 243
 
0.3%
D 73
 
0.1%
L 49
 
0.1%
S 37
 
< 0.1%
Other values (5) 121
 
0.1%
Other Punctuation
ValueCountFrequency (%)
; 641
76.6%
% 148
 
17.7%
' 48
 
5.7%
Decimal Number
ValueCountFrequency (%)
9 148
50.0%
5 148
50.0%
Space Separator
ValueCountFrequency (%)
4518
100.0%
Open Punctuation
ValueCountFrequency (%)
( 198
100.0%
Close Punctuation
ValueCountFrequency (%)
) 198
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 182
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 533048
98.8%
Common 6229
 
1.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 83011
15.6%
n 78739
14.8%
e 76503
14.4%
r 75316
14.1%
z 72657
13.6%
F 72306
13.6%
l 14346
 
2.7%
a 13325
 
2.5%
t 10601
 
2.0%
i 8777
 
1.6%
Other values (28) 27467
 
5.2%
Common
ValueCountFrequency (%)
4518
72.5%
; 641
 
10.3%
( 198
 
3.2%
) 198
 
3.2%
- 182
 
2.9%
9 148
 
2.4%
% 148
 
2.4%
5 148
 
2.4%
' 48
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 539277
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 83011
15.4%
n 78739
14.6%
e 76503
14.2%
r 75316
14.0%
z 72657
13.5%
F 72306
13.4%
l 14346
 
2.7%
a 13325
 
2.5%
t 10601
 
2.0%
i 8777
 
1.6%
Other values (37) 33696
6.2%
Distinct 4
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 2.6 MiB

Length

Max length 13
Median length 13
Mean length 12.3834919
Min length 2

Characters and Unicode

Total characters 4191069
Distinct characters 13
Distinct categories 2
Distinct scripts 2
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row in collection
2nd row in collection
3rd row in collection
4th row in collection
5th row in collection
ValueCountFrequency (%)
in 298638
46.9%
collection 298638
46.9%
consumed 38038
 
6.0%
yes 943
 
0.1%
no 821
 
0.1%

Most occurring characters

ValueCountFrequency (%)
n 636135
15.2%
o 636135
15.2%
c 635314
15.2%
i 597276
14.3%
l 597276
14.3%
e 337619
8.1%
298638
7.1%
t 298638
7.1%
s 38981
 
0.9%
u 38038
 
0.9%
Other values (3) 77019
 
1.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3892431
92.9%
Space Separator 298638
 
7.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 636135
16.3%
o 636135
16.3%
c 635314
16.3%
i 597276
15.3%
l 597276
15.3%
e 337619
8.7%
t 298638
7.7%
s 38981
 
1.0%
u 38038
 
1.0%
m 38038
 
1.0%
Other values (2) 38981
 
1.0%
Space Separator
ValueCountFrequency (%)
298638
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3892431
92.9%
Common 298638
 
7.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 636135
16.3%
o 636135
16.3%
c 635314
16.3%
i 597276
15.3%
l 597276
15.3%
e 337619
8.7%
t 298638
7.7%
s 38981
 
1.0%
u 38038
 
1.0%
m 38038
 
1.0%
Other values (2) 38981
 
1.0%
Common
ValueCountFrequency (%)
298638
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4191069
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 636135
15.2%
o 636135
15.2%
c 635314
15.2%
i 597276
14.3%
l 597276
14.3%
e 337619
8.1%
298638
7.1%
t 298638
7.1%
s 38981
 
0.9%
u 38038
 
0.9%
Other values (3) 77019
 
1.8%

associatedMedia
Text

Missing 

Distinct 11559
Distinct (%) 79.4%
Missing 323875
Missing (%) 95.7%
Memory size 2.6 MiB

Length

Max length 369
Median length 49
Mean length 52.45417096
Min length 49

Characters and Unicode

Total characters 763995
Distinct characters 31
Distinct categories 5
Distinct scripts 2
Distinct blocks 1

Unique

Unique 9279
Unique (%) 63.7%

Sample

1st row https://collections.nmnh.si.edu/media/?i=15102863
2nd row https://collections.nmnh.si.edu/media/?i=15392053
3rd row https://collections.nmnh.si.edu/media/?i=15102609
4th row https://collections.nmnh.si.edu/media/?i=15102164
5th row https://collections.nmnh.si.edu/media/?i=15100806
ValueCountFrequency (%)
https://collections.nmnh.si.edu/media/?i=16192884 83
 
0.4%
https://collections.nmnh.si.edu/media/?i=14723169 38
 
0.2%
14723158 38
 
0.2%
https://collections.nmnh.si.edu/media/?i=13853473 34
 
0.2%
https://collections.nmnh.si.edu/media/?i=14322468 30
 
0.2%
https://collections.nmnh.si.edu/media/?i=13822124 28
 
0.1%
https://collections.nmnh.si.edu/media/?i=13812183 28
 
0.1%
https://collections.nmnh.si.edu/media/?i=13812175 28
 
0.1%
https://collections.nmnh.si.edu/media/?i=13812196 24
 
0.1%
https://collections.nmnh.si.edu/media/?i=13858205 22
 
0.1%
Other values (14612) 19243
98.2%

Most occurring characters

ValueCountFrequency (%)
i 58260
 
7.6%
/ 58260
 
7.6%
t 43695
 
5.7%
s 43695
 
5.7%
. 43695
 
5.7%
n 43695
 
5.7%
e 43695
 
5.7%
1 38923
 
5.1%
d 29130
 
3.8%
m 29130
 
3.8%
Other values (21) 331817
43.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 451515
59.1%
Decimal Number 156768
 
20.5%
Other Punctuation 136116
 
17.8%
Math Symbol 14565
 
1.9%
Space Separator 5031
 
0.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 58260
12.9%
t 43695
9.7%
s 43695
9.7%
n 43695
9.7%
e 43695
9.7%
d 29130
 
6.5%
m 29130
 
6.5%
h 29130
 
6.5%
o 29130
 
6.5%
c 29130
 
6.5%
Other values (4) 72825
16.1%
Decimal Number
ValueCountFrequency (%)
1 38923
24.8%
5 21259
13.6%
4 18285
11.7%
2 13953
 
8.9%
0 13856
 
8.8%
3 12557
 
8.0%
9 12178
 
7.8%
7 10019
 
6.4%
8 8006
 
5.1%
6 7732
 
4.9%
Other Punctuation
ValueCountFrequency (%)
/ 58260
42.8%
. 43695
32.1%
? 14565
 
10.7%
: 14565
 
10.7%
; 5031
 
3.7%
Math Symbol
ValueCountFrequency (%)
= 14565
100.0%
Space Separator
ValueCountFrequency (%)
5031
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 451515
59.1%
Common 312480
40.9%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 58260
18.6%
. 43695
14.0%
1 38923
12.5%
5 21259
 
6.8%
4 18285
 
5.9%
= 14565
 
4.7%
? 14565
 
4.7%
: 14565
 
4.7%
2 13953
 
4.5%
0 13856
 
4.4%
Other values (7) 60554
19.4%
Latin
ValueCountFrequency (%)
i 58260
12.9%
t 43695
9.7%
s 43695
9.7%
n 43695
9.7%
e 43695
9.7%
d 29130
 
6.5%
m 29130
 
6.5%
h 29130
 
6.5%
o 29130
 
6.5%
c 29130
 
6.5%
Other values (4) 72825
16.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 763995
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 58260
 
7.6%
/ 58260
 
7.6%
t 43695
 
5.7%
s 43695
 
5.7%
. 43695
 
5.7%
n 43695
 
5.7%
e 43695
 
5.7%
1 38923
 
5.1%
d 29130
 
3.8%
m 29130
 
3.8%
Other values (21) 331817
43.4%

associatedSequences
Text

Missing 

Distinct 25157
Distinct (%) 76.9%
Missing 305730
Missing (%) 90.3%
Memory size 2.6 MiB

Length

Max length 228
Median length 8
Mean length 11.35741363
Min length 8

Characters and Unicode

Total characters 371501
Distinct characters 35
Distinct categories 4
Distinct scripts 2
Distinct blocks 1

Unique

Unique 17604
Unique (%) 53.8%

Sample

1st row MW204230; MW124559
2nd row MW982336
3rd row MF785606; MF785913
4th row MN344605
5th row JQ840329
ValueCountFrequency (%)
prjna345052 17
 
< 0.1%
prjna396973 12
 
< 0.1%
mn345496 2
 
< 0.1%
mw982402 2
 
< 0.1%
mg968118 2
 
< 0.1%
mn344953 2
 
< 0.1%
mw983235 2
 
< 0.1%
mw983078 2
 
< 0.1%
mn345717 2
 
< 0.1%
mw277973 2
 
< 0.1%
Other values (35813) 43421
99.9%

Most occurring characters

ValueCountFrequency (%)
8 34043
 
9.2%
3 32605
 
8.8%
4 30970
 
8.3%
9 27689
 
7.5%
M 27269
 
7.3%
2 27091
 
7.3%
7 23954
 
6.4%
0 23205
 
6.2%
5 21376
 
5.8%
1 21132
 
5.7%
Other values (25) 102167
27.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 262170
70.6%
Uppercase Letter 87817
 
23.6%
Other Punctuation 10758
 
2.9%
Space Separator 10756
 
2.9%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 27269
31.1%
W 9704
 
11.1%
O 9366
 
10.7%
Q 7274
 
8.3%
N 6102
 
6.9%
F 4776
 
5.4%
J 4067
 
4.6%
H 3584
 
4.1%
K 3130
 
3.6%
P 2662
 
3.0%
Other values (11) 9883
 
11.3%
Decimal Number
ValueCountFrequency (%)
8 34043
13.0%
3 32605
12.4%
4 30970
11.8%
9 27689
10.6%
2 27091
10.3%
7 23954
9.1%
0 23205
8.9%
5 21376
8.2%
1 21132
8.1%
6 20105
7.7%
Other Punctuation
ValueCountFrequency (%)
; 10756
> 99.9%
/ 1
 
< 0.1%
. 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
10756
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 283684
76.4%
Latin 87817
 
23.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 27269
31.1%
W 9704
 
11.1%
O 9366
 
10.7%
Q 7274
 
8.3%
N 6102
 
6.9%
F 4776
 
5.4%
J 4067
 
4.6%
H 3584
 
4.1%
K 3130
 
3.6%
P 2662
 
3.0%
Other values (11) 9883
 
11.3%
Common
ValueCountFrequency (%)
8 34043
12.0%
3 32605
11.5%
4 30970
10.9%
9 27689
9.8%
2 27091
9.5%
7 23954
8.4%
0 23205
8.2%
5 21376
7.5%
1 21132
7.4%
6 20105
7.1%
Other values (4) 21514
7.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 371501
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8 34043
 
9.2%
3 32605
 
8.8%
4 30970
 
8.3%
9 27689
 
7.5%
M 27269
 
7.3%
2 27091
 
7.3%
7 23954
 
6.4%
0 23205
 
6.2%
5 21376
 
5.8%
1 21132
 
5.7%
Other values (25) 102167
27.5%

occurrenceRemarks
Text

Missing 

Distinct 28700
Distinct (%) 19.8%
Missing 193737
Missing (%) 57.2%
Memory size 2.6 MiB

Length

Max length 120818
Median length 61
Mean length 79.26073406
Min length 1

Characters and Unicode

Total characters 11469266
Distinct characters 116
Distinct categories 14
Distinct scripts 2
Distinct blocks 4

Unique

Unique 19961
Unique (%) 13.8%

Sample

1st row One leg removed for genetic sampling while on loan to GUELPH.
2nd row Order: 10948; Box Number: MBARI_0136: Box Position: B/4
3rd row One leg removed for genetic sampling while on loan to GUELPH.
4th row Originally cataloged as an image record because field notes indicated there was a photovoucher for the specimen. When the images were cataloged in early 2020, no photos were found for this specimen so the record was changed to a Genetic Sample (DNA) with no voucher.
5th row Entire tissue sample consumed for DNA extraction. Specimen voucher located at Museum National d'Histoire Naturelle, Paris.
ValueCountFrequency (%)
for 114843
 
6.1%
on 113412
 
6.0%
to 111958
 
5.9%
genetic 110770
 
5.9%
while 109786
 
5.8%
sampling 108913
 
5.8%
loan 108870
 
5.8%
removed 108857
 
5.8%
guelph 108797
 
5.8%
one 105620
 
5.6%
Other values (39978) 787611
41.7%

Most occurring characters

ValueCountFrequency (%)
1726121
 
15.0%
e 1095754
 
9.6%
o 790058
 
6.9%
n 707325
 
6.2%
l 585110
 
5.1%
i 570424
 
5.0%
a 443981
 
3.9%
t 412504
 
3.6%
r 412405
 
3.6%
g 358988
 
3.1%
Other values (106) 4366596
38.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7289984
63.6%
Space Separator 1726121
 
15.0%
Uppercase Letter 1411685
 
12.3%
Decimal Number 488632
 
4.3%
Other Punctuation 397122
 
3.5%
Control 85346
 
0.7%
Dash Punctuation 28244
 
0.2%
Math Symbol 18258
 
0.2%
Connector Punctuation 15626
 
0.1%
Open Punctuation 4114
 
< 0.1%
Other values (4) 4134
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1095754
15.0%
o 790058
10.8%
n 707325
9.7%
l 585110
 
8.0%
i 570424
 
7.8%
a 443981
 
6.1%
t 412504
 
5.7%
r 412405
 
5.7%
g 358988
 
4.9%
m 292695
 
4.0%
Other values (31) 1620740
22.2%
Uppercase Letter
ValueCountFrequency (%)
P 144183
10.2%
O 140094
9.9%
G 139730
9.9%
H 124624
8.8%
U 124299
8.8%
E 121592
8.6%
L 117111
 
8.3%
B 74857
 
5.3%
N 66333
 
4.7%
M 62836
 
4.5%
Other values (19) 296026
21.0%
Other Punctuation
ValueCountFrequency (%)
. 176037
44.3%
: 84185
21.2%
; 72201
18.2%
, 34683
 
8.7%
/ 19654
 
4.9%
' 3148
 
0.8%
" 3113
 
0.8%
# 2394
 
0.6%
& 1017
 
0.3%
? 575
 
0.1%
Other values (4) 115
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 88589
18.1%
0 74354
15.2%
2 52624
10.8%
9 47648
9.8%
3 42145
8.6%
4 38220
7.8%
5 37436
7.7%
6 36920
7.6%
8 36777
7.5%
7 33919
 
6.9%
Math Symbol
ValueCountFrequency (%)
| 17759
97.3%
= 399
 
2.2%
+ 57
 
0.3%
~ 17
 
0.1%
< 16
 
0.1%
> 10
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 3436
83.6%
] 667
 
16.2%
} 6
 
0.1%
Other Symbol
ValueCountFrequency (%)
° 14
60.9%
5
 
21.7%
4
 
17.4%
Control
ValueCountFrequency (%)
84899
99.5%
447
 
0.5%
Dash Punctuation
ValueCountFrequency (%)
- 28001
99.1%
243
 
0.9%
Open Punctuation
ValueCountFrequency (%)
( 3444
83.7%
[ 670
 
16.3%
Space Separator
ValueCountFrequency (%)
1726121
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 15626
100.0%
Final Punctuation
ValueCountFrequency (%)
» 1
100.0%
Currency Symbol
ValueCountFrequency (%)
¢ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8701657
75.9%
Common 2767609
 
24.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1095754
 
12.6%
o 790058
 
9.1%
n 707325
 
8.1%
l 585110
 
6.7%
i 570424
 
6.6%
a 443981
 
5.1%
t 412504
 
4.7%
r 412405
 
4.7%
g 358988
 
4.1%
m 292695
 
3.4%
Other values (59) 3032413
34.8%
Common
ValueCountFrequency (%)
1726121
62.4%
. 176037
 
6.4%
1 88589
 
3.2%
84899
 
3.1%
: 84185
 
3.0%
0 74354
 
2.7%
; 72201
 
2.6%
2 52624
 
1.9%
9 47648
 
1.7%
3 42145
 
1.5%
Other values (37) 318806
 
11.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11468894
> 99.9%
Punctuation 243
 
< 0.1%
None 120
 
< 0.1%
Misc Symbols 9
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1726121
 
15.1%
e 1095754
 
9.6%
o 790058
 
6.9%
n 707325
 
6.2%
l 585110
 
5.1%
i 570424
 
5.0%
a 443981
 
3.9%
t 412504
 
3.6%
r 412405
 
3.6%
g 358988
 
3.1%
Other values (81) 4366224
38.1%
Punctuation
ValueCountFrequency (%)
243
100.0%
None
ValueCountFrequency (%)
é 24
20.0%
ã 15
12.5%
° 14
11.7%
í 14
11.7%
µ 12
10.0%
ó 9
 
7.5%
Î 4
 
3.3%
ç 4
 
3.3%
á 4
 
3.3%
¿ 3
 
2.5%
Other values (12) 17
14.2%
Misc Symbols
ValueCountFrequency (%)
5
55.6%
4
44.4%

organismID
Text

Missing 

Distinct 4
Distinct (%) 100.0%
Missing 338436
Missing (%) > 99.9%
Memory size 2.6 MiB

Length

Max length 13
Median length 12.5
Mean length 11.25
Min length 10

Characters and Unicode

Total characters 45
Distinct characters 19
Distinct categories 5
Distinct scripts 2
Distinct blocks 2

Unique

Unique 4
Unique (%) 100.0%

Sample

1st row 56°07'00"W
2nd row 2°27'29.11"W
3rd row 57°39'00"W
4th row 138deg49'41"E
ValueCountFrequency (%)
56°07'00"w 1
25.0%
2°27'29.11"w 1
25.0%
57°39'00"w 1
25.0%
138deg49'41"e 1
25.0%
2025-01-14T11:34:40.757945image/svg+xmlMatplotlib v3.9.4, https://matplotlib.org/

Most occurring characters

ValueCountFrequency (%)
0 5
11.1%
1 4
 
8.9%
' 4
 
8.9%
" 4
 
8.9%
9 3
 
6.7%
° 3
 
6.7%
7 3
 
6.7%
W 3
 
6.7%
2 3
 
6.7%
4 2
 
4.4%
Other values (9) 11
24.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 26
57.8%
Other Punctuation 9
 
20.0%
Uppercase Letter 4
 
8.9%
Other Symbol 3
 
6.7%
Lowercase Letter 3
 
6.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 5
19.2%
1 4
15.4%
9 3
11.5%
7 3
11.5%
2 3
11.5%
4 2
 
7.7%
3 2
 
7.7%
5 2
 
7.7%
8 1
 
3.8%
6 1
 
3.8%
Other Punctuation
ValueCountFrequency (%)
' 4
44.4%
" 4
44.4%
. 1
 
11.1%
Lowercase Letter
ValueCountFrequency (%)
d 1
33.3%
e 1
33.3%
g 1
33.3%
Uppercase Letter
ValueCountFrequency (%)
W 3
75.0%
E 1
 
25.0%
Other Symbol
ValueCountFrequency (%)
° 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 38
84.4%
Latin 7
 
15.6%

Most frequent character per script

Common
ValueCountFrequency (%)
0 5
13.2%
1 4
10.5%
' 4
10.5%
" 4
10.5%
9 3
7.9%
° 3
7.9%
7 3
7.9%
2 3
7.9%
4 2
 
5.3%
3 2
 
5.3%
Other values (4) 5
13.2%
Latin
ValueCountFrequency (%)
W 3
42.9%
d 1
 
14.3%
e 1
 
14.3%
g 1
 
14.3%
E 1
 
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42
93.3%
None 3
 
6.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 5
11.9%
1 4
9.5%
' 4
9.5%
" 4
9.5%
9 3
 
7.1%
7 3
 
7.1%
W 3
 
7.1%
2 3
 
7.1%
4 2
 
4.8%
3 2
 
4.8%
Other values (8) 9
21.4%
None
ValueCountFrequency (%)
° 3
100.0%
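
The category counts reported for organismID above can be reproduced with Python's standard `unicodedata` module. The following is a minimal sketch, not the profiler's own implementation: the function name `category_counts` and the `CATEGORY_NAMES` mapping are illustrative assumptions, and only the four sampled values are used as input. Script and block breakdowns are not covered, since the standard library does not expose those properties.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in this column
# (mapping written by hand for illustration).
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "So": "Other Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The four non-missing organismID values shown in the Sample block.
organism_id_values = [
    "56°07'00\"W",
    "2°27'29.11\"W",
    "57°39'00\"W",
    "138deg49'41\"E",
]
counts = category_counts(organism_id_values)
for name, count in counts.most_common():
    print(name, count)
```

The degree sign is classified as Other Symbol, which is why that category appears in a column that otherwise holds digits, quotes, and hemisphere letters.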

organismName
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 338438
Missing (%): > 99.9%
Memory size: 2.6 MiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 10
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 918.0
2nd row: 651.0

Value    Count  Frequency (%)
918.0    1      50.0%
651.0    1      50.0%

Most occurring characters

ValueCountFrequency (%)
1 2
20.0%
. 2
20.0%
0 2
20.0%
9 1
10.0%
8 1
10.0%
6 1
10.0%
5 1
10.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8
80.0%
Other Punctuation 2
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 2
25.0%
0 2
25.0%
9 1
12.5%
8 1
12.5%
6 1
12.5%
5 1
12.5%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 10
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2
20.0%
. 2
20.0%
0 2
20.0%
9 1
10.0%
8 1
10.0%
6 1
10.0%
5 1
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2
20.0%
. 2
20.0%
0 2
20.0%
9 1
10.0%
8 1
10.0%
6 1
10.0%
5 1
10.0%

organismScope
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 338439
Missing (%): > 99.9%
Memory size: 2.6 MiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 5
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 963.0

Value    Count  Frequency (%)
963.0    1      100.0%

Most occurring characters

ValueCountFrequency (%)
9 1
20.0%
6 1
20.0%
3 1
20.0%
. 1
20.0%
0 1
20.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4
80.0%
Other Punctuation 1
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 1
25.0%
6 1
25.0%
3 1
25.0%
0 1
25.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9 1
20.0%
6 1
20.0%
3 1
20.0%
. 1
20.0%
0 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 1
20.0%
6 1
20.0%
3 1
20.0%
. 1
20.0%
0 1
20.0%

materialSampleID
Text

Missing 

Distinct: 253362
Distinct (%): 100.0%
Missing: 85078
Missing (%): 25.1%
Memory size: 2.6 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 1773534
Distinct characters: 36
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 253362
Unique (%): 100.0%

Sample

1st row: AR5TC43
2nd row: AL2IC84
3rd row: AF9HI08
4th row: AD5JZ99
5th row: AE0OQ35

Value                    Count   Frequency (%)
ar5tc43                  1       < 0.1%
ae3rz90                  1       < 0.1%
am1rc30                  1       < 0.1%
al5lg46                  1       < 0.1%
an9jb30                  1       < 0.1%
af9hi08                  1       < 0.1%
ad5jz99                  1       < 0.1%
ae0oq35                  1       < 0.1%
an7hd65                  1       < 0.1%
ak3zy87                  1       < 0.1%
Other values (253352)    253352  > 99.9%

Most occurring characters

ValueCountFrequency (%)
A 287914
 
16.2%
7 79879
 
4.5%
1 77955
 
4.4%
2 77347
 
4.4%
0 77102
 
4.3%
4 76676
 
4.3%
5 76415
 
4.3%
3 76089
 
4.3%
9 75065
 
4.2%
6 73706
 
4.2%
Other values (26) 795386
44.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1013448
57.1%
Decimal Number 760086
42.9%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 287914
28.4%
O 39979
 
3.9%
R 39164
 
3.9%
K 38882
 
3.8%
E 36182
 
3.6%
C 35417
 
3.5%
L 34815
 
3.4%
H 34226
 
3.4%
I 33753
 
3.3%
F 33691
 
3.3%
Other values (16) 399425
39.4%
Decimal Number
ValueCountFrequency (%)
7 79879
10.5%
1 77955
10.3%
2 77347
10.2%
0 77102
10.1%
4 76676
10.1%
5 76415
10.1%
3 76089
10.0%
9 75065
9.9%
6 73706
9.7%
8 69852
9.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 1013448
57.1%
Common 760086
42.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 287914
28.4%
O 39979
 
3.9%
R 39164
 
3.9%
K 38882
 
3.8%
E 36182
 
3.6%
C 35417
 
3.5%
L 34815
 
3.4%
H 34226
 
3.4%
I 33753
 
3.3%
F 33691
 
3.3%
Other values (16) 399425
39.4%
Common
ValueCountFrequency (%)
7 79879
10.5%
1 77955
10.3%
2 77347
10.2%
0 77102
10.1%
4 76676
10.1%
5 76415
10.1%
3 76089
10.0%
9 75065
9.9%
6 73706
9.7%
8 69852
9.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1773534
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 287914
 
16.2%
7 79879
 
4.5%
1 77955
 
4.4%
2 77347
 
4.4%
0 77102
 
4.3%
4 76676
 
4.3%
5 76415
 
4.3%
3 76089
 
4.3%
9 75065
 
4.2%
6 73706
 
4.2%
Other values (26) 795386
44.8%
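
Every materialSampleID above is 7 characters long, and the category totals (1013448 uppercase letters and 760086 digits over 253362 values) work out to exactly 4 letters and 3 digits per value. The sketch below checks values against one shape that fits the sampled rows: two letters, one digit, two letters, two digits. That exact ordering is an assumption inferred from the ten or so values visible in the report, not a documented identifier format, and `SAMPLE_ID_PATTERN` and `is_well_formed_sample_id` are hypothetical names.

```python
import re

# Assumed shape, inferred only from the sampled rows:
# 2 uppercase letters, 1 digit, 2 uppercase letters, 2 digits.
SAMPLE_ID_PATTERN = re.compile(r"[A-Z]{2}[0-9][A-Z]{2}[0-9]{2}")


def is_well_formed_sample_id(value):
    """Return True if the whole string matches the assumed ID shape."""
    return SAMPLE_ID_PATTERN.fullmatch(value) is not None


sampled = ["AR5TC43", "AL2IC84", "AF9HI08", "AD5JZ99", "AE0OQ35"]
print([is_well_formed_sample_id(v) for v in sampled])
```

A check like this is mainly useful for spotting values that drifted in from a neighbouring column, which several other variables in this report appear to suffer from.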

eventType
Text

Missing 

Distinct: 4
Distinct (%): 100.0%
Missing: 338436
Missing (%): > 99.9%
Memory size: 2.6 MiB

Length

Max length: 7
Median length: 7
Mean length: 6.75
Min length: 6

Characters and Unicode

Total characters: 27
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): 100.0%

Sample

1st row: 10.6925
2nd row: 5.55461
3rd row: 7.1633
4th row: 5.80961

Value      Count  Frequency (%)
10.6925    1      25.0%
5.55461    1      25.0%
7.1633     1      25.0%
5.80961    1      25.0%

Most occurring characters

ValueCountFrequency (%)
5 5
18.5%
1 4
14.8%
. 4
14.8%
6 4
14.8%
0 2
 
7.4%
9 2
 
7.4%
3 2
 
7.4%
2 1
 
3.7%
4 1
 
3.7%
7 1
 
3.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 23
85.2%
Other Punctuation 4
 
14.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 5
21.7%
1 4
17.4%
6 4
17.4%
0 2
 
8.7%
9 2
 
8.7%
3 2
 
8.7%
2 1
 
4.3%
4 1
 
4.3%
7 1
 
4.3%
8 1
 
4.3%
Other Punctuation
ValueCountFrequency (%)
. 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 27
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 5
18.5%
1 4
14.8%
. 4
14.8%
6 4
14.8%
0 2
 
7.4%
9 2
 
7.4%
3 2
 
7.4%
2 1
 
3.7%
4 1
 
3.7%
7 1
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 5
18.5%
1 4
14.8%
. 4
14.8%
6 4
14.8%
0 2
 
7.4%
9 2
 
7.4%
3 2
 
7.4%
2 1
 
3.7%
4 1
 
3.7%
7 1
 
3.7%

fieldNumber
Text

Missing 

Distinct: 7070
Distinct (%): 10.0%
Missing: 267431
Missing (%): 79.0%
Memory size: 2.6 MiB

Length

Max length: 56
Median length: 43
Mean length: 11.55291583
Min length: 1

Characters and Unicode

Total characters: 820361
Distinct characters: 72
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2667
Unique (%): 3.8%

Sample

1st row: MBARI/T548
2nd row: MBIO/BIZ-231
3rd row: Moorea F-06-12
4th row: MBARI/T488
5th row: AL-4097

Value                   Count  Frequency (%)
cb                      3400   3.7%
moorea                  3156   3.5%
fp                      1216   1.3%
lrp                     1033   1.1%
bah                     991    1.1%
tob                     838    0.9%
cur                     813    0.9%
mbio/080611_minv_014    626    0.7%
dgs                     506    0.6%
sec18-07                504    0.6%
Other values (7241)     78083  85.6%

Most occurring characters

ValueCountFrequency (%)
0 78674
 
9.6%
- 70165
 
8.6%
1 62347
 
7.6%
B 45941
 
5.6%
2 44017
 
5.4%
I 35464
 
4.3%
M 34420
 
4.2%
A 34162
 
4.2%
3 27081
 
3.3%
8 21679
 
2.6%
Other values (62) 366411
44.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 326117
39.8%
Decimal Number 325547
39.7%
Dash Punctuation 70165
 
8.6%
Lowercase Letter 34360
 
4.2%
Other Punctuation 26194
 
3.2%
Space Separator 20157
 
2.5%
Connector Punctuation 17780
 
2.2%
Math Symbol 37
 
< 0.1%
Open Punctuation 2
 
< 0.1%
Close Punctuation 2
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
B 45941
14.1%
I 35464
10.9%
M 34420
10.6%
A 34162
10.5%
R 18907
 
5.8%
S 18652
 
5.7%
O 17093
 
5.2%
C 16710
 
5.1%
L 16139
 
4.9%
U 13291
 
4.1%
Other values (16) 75338
23.1%
Lowercase Letter
ValueCountFrequency (%)
o 7987
23.2%
e 4721
13.7%
r 3987
11.6%
a 3742
10.9%
n 2355
 
6.9%
i 1842
 
5.4%
m 1834
 
5.3%
t 1781
 
5.2%
v 1564
 
4.6%
l 1062
 
3.1%
Other values (14) 3485
10.1%
Decimal Number
ValueCountFrequency (%)
0 78674
24.2%
1 62347
19.2%
2 44017
13.5%
3 27081
 
8.3%
8 21679
 
6.7%
6 20194
 
6.2%
4 19711
 
6.1%
7 19228
 
5.9%
5 17540
 
5.4%
9 15076
 
4.6%
Other Punctuation
ValueCountFrequency (%)
/ 21523
82.2%
; 4626
 
17.7%
. 18
 
0.1%
# 14
 
0.1%
: 12
 
< 0.1%
, 1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 70165
100.0%
Space Separator
ValueCountFrequency (%)
20157
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 17780
100.0%
Math Symbol
ValueCountFrequency (%)
> 37
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 459884
56.1%
Latin 360477
43.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
B 45941
12.7%
I 35464
 
9.8%
M 34420
 
9.5%
A 34162
 
9.5%
R 18907
 
5.2%
S 18652
 
5.2%
O 17093
 
4.7%
C 16710
 
4.6%
L 16139
 
4.5%
U 13291
 
3.7%
Other values (40) 109698
30.4%
Common
ValueCountFrequency (%)
0 78674
17.1%
- 70165
15.3%
1 62347
13.6%
2 44017
9.6%
3 27081
 
5.9%
8 21679
 
4.7%
/ 21523
 
4.7%
6 20194
 
4.4%
20157
 
4.4%
4 19711
 
4.3%
Other values (12) 74336
16.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 820361
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 78674
 
9.6%
- 70165
 
8.6%
1 62347
 
7.6%
B 45941
 
5.6%
2 44017
 
5.4%
I 35464
 
4.3%
M 34420
 
4.2%
A 34162
 
4.2%
3 27081
 
3.3%
8 21679
 
2.6%
Other values (62) 366411
44.7%
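
The percentages and the mean length reported for fieldNumber can be cross-checked from the raw counts. The sketch below assumes, based only on the numbers agreeing, that Missing (%) is taken over all 338440 observations while Distinct (%), Unique (%), and Mean length are taken over the non-missing values. `profile_ratios` is a hypothetical helper, not part of any profiling library.

```python
def profile_ratios(n_rows, n_missing, n_distinct, n_unique, total_chars):
    """Recompute summary ratios from raw counts.

    Assumption: distinct/unique percentages and mean length use the
    non-missing row count as denominator; missing % uses all rows.
    """
    present = n_rows - n_missing
    return {
        "missing_pct": 100 * n_missing / n_rows,
        "distinct_pct": 100 * n_distinct / present,
        "unique_pct": 100 * n_unique / present,
        "mean_length": total_chars / present,
    }


# Counts copied from the fieldNumber section and the dataset overview.
field_number = profile_ratios(
    n_rows=338440,
    n_missing=267431,
    n_distinct=7070,
    n_unique=2667,
    total_chars=820361,
)
for key, value in field_number.items():
    print(f"{key}: {value:.4f}")
```

Rounded to one decimal place, the results line up with the reported 79.0%, 10.0%, and 3.8%, and the mean length agrees with 11.55291583.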

eventDate
Text

Missing 

Distinct: 23060
Distinct (%): 7.2%
Missing: 16369
Missing (%): 4.8%
Memory size: 2.6 MiB

Length

Max length: 24
Median length: 10
Mean length: 11.08453105
Min length: 4

Characters and Unicode

Total characters: 3570006
Distinct characters: 24
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1370
Unique (%): 0.4%

Sample

1st row: 1977-05-21
2nd row: 2003-04-05
3rd row: 2009-12-05
4th row: 2006-09-14
5th row: 2003-05-01/2003-05-13

Value                    Count   Frequency (%)
2018-03-19/2018-03-23    1120    0.3%
2016-02-22/2016-03-09    842     0.3%
2008-06-11               649     0.2%
2017-05-26               623     0.2%
2015-05-09               524     0.2%
2017-05-23               519     0.2%
2017-05-30               515     0.2%
2006-03-12               513     0.2%
2017-08-14               508     0.2%
2017-05-27               505     0.2%
Other values (23050)     315807  98.0%

Most occurring characters

ValueCountFrequency (%)
0 772965
21.7%
- 704820
19.7%
1 577630
16.2%
2 414060
11.6%
9 302792
 
8.5%
8 151695
 
4.2%
7 140417
 
3.9%
6 131389
 
3.7%
5 123971
 
3.5%
3 120245
 
3.4%
Other values (14) 130022
 
3.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2831662
79.3%
Dash Punctuation 704820
 
19.7%
Other Punctuation 33409
 
0.9%
Space Separator 54
 
< 0.1%
Lowercase Letter 52
 
< 0.1%
Uppercase Letter 7
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 772965
27.3%
1 577630
20.4%
2 414060
14.6%
9 302792
 
10.7%
8 151695
 
5.4%
7 140417
 
5.0%
6 131389
 
4.6%
5 123971
 
4.4%
3 120245
 
4.2%
4 96498
 
3.4%
Uppercase Letter
ValueCountFrequency (%)
G 2
28.6%
S 2
28.6%
W 1
14.3%
E 1
14.3%
P 1
14.3%
Other Punctuation
ValueCountFrequency (%)
/ 33225
99.4%
, 183
 
0.5%
: 1
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
o 26
50.0%
r 26
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 704820
100.0%
Space Separator
ValueCountFrequency (%)
54
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3569947
> 99.9%
Latin 59
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 772965
21.7%
- 704820
19.7%
1 577630
16.2%
2 414060
11.6%
9 302792
 
8.5%
8 151695
 
4.2%
7 140417
 
3.9%
6 131389
 
3.7%
5 123971
 
3.5%
3 120245
 
3.4%
Other values (7) 129963
 
3.6%
Latin
ValueCountFrequency (%)
o 26
44.1%
r 26
44.1%
G 2
 
3.4%
S 2
 
3.4%
W 1
 
1.7%
E 1
 
1.7%
P 1
 
1.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3570006
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 772965
21.7%
- 704820
19.7%
1 577630
16.2%
2 414060
11.6%
9 302792
 
8.5%
8 151695
 
4.2%
7 140417
 
3.9%
6 131389
 
3.7%
5 123971
 
3.5%
3 120245
 
3.4%
Other values (14) 130022
 
3.6%

eventTime
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 338439
Missing (%): > 99.9%
Memory size: 2.6 MiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 5
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 94648

Value    Count  Frequency (%)
94648    1      100.0%

Most occurring characters

ValueCountFrequency (%)
4 2
40.0%
9 1
20.0%
6 1
20.0%
8 1
20.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 2
40.0%
9 1
20.0%
6 1
20.0%
8 1
20.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 2
40.0%
9 1
20.0%
6 1
20.0%
8 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 2
40.0%
9 1
20.0%
6 1
20.0%
8 1
20.0%

startDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): 0.1%
Missing: 18131
Missing (%): 5.4%
Memory size: 2.6 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.771973313
Min length: 1

Characters and Unicode

Total characters: 887888
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 141
2nd row: 95
3rd row: 339
4th row: 257
5th row: 121

Value                 Count   Frequency (%)
142                   2414    0.8%
78                    1966    0.6%
140                   1912    0.6%
147                   1847    0.6%
201                   1845    0.6%
152                   1832    0.6%
197                   1819    0.6%
182                   1814    0.6%
150                   1806    0.6%
146                   1793    0.6%
Other values (356)    301261  94.1%

Most occurring characters

ValueCountFrequency (%)
1 193442
21.8%
2 157736
17.8%
3 100838
11.4%
4 66349
 
7.5%
5 65465
 
7.4%
7 63652
 
7.2%
0 61103
 
6.9%
6 61082
 
6.9%
8 59915
 
6.7%
9 58306
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 887888
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 193442
21.8%
2 157736
17.8%
3 100838
11.4%
4 66349
 
7.5%
5 65465
 
7.4%
7 63652
 
7.2%
0 61103
 
6.9%
6 61082
 
6.9%
8 59915
 
6.7%
9 58306
 
6.6%

Most occurring scripts

ValueCountFrequency (%)
Common 887888
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 193442
21.8%
2 157736
17.8%
3 100838
11.4%
4 66349
 
7.5%
5 65465
 
7.4%
7 63652
 
7.2%
0 61103
 
6.9%
6 61082
 
6.9%
8 59915
 
6.7%
9 58306
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 887888
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 193442
21.8%
2 157736
17.8%
3 100838
11.4%
4 66349
 
7.5%
5 65465
 
7.4%
7 63652
 
7.2%
0 61103
 
6.9%
6 61082
 
6.9%
8 59915
 
6.7%
9 58306
 
6.6%

endDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): 0.1%
Missing: 17911
Missing (%): 5.3%
Memory size: 2.6 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.778768848
Min length: 1

Characters and Unicode

Total characters: 890676
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 141
2nd row: 95
3rd row: 339
4th row: 257
5th row: 133

Value                 Count   Frequency (%)
142                   2346    0.7%
151                   2066    0.6%
150                   2017    0.6%
82                    1898    0.6%
212                   1891    0.6%
143                   1865    0.6%
69                    1862    0.6%
197                   1800    0.6%
146                   1794    0.6%
147                   1756    0.5%
Other values (356)    301234  94.0%

Most occurring characters

ValueCountFrequency (%)
1 189931
21.3%
2 159764
17.9%
3 101857
11.4%
4 67311
 
7.6%
5 65248
 
7.3%
0 63136
 
7.1%
6 62126
 
7.0%
7 61959
 
7.0%
8 59962
 
6.7%
9 59382
 
6.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 890676
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 189931
21.3%
2 159764
17.9%
3 101857
11.4%
4 67311
 
7.6%
5 65248
 
7.3%
0 63136
 
7.1%
6 62126
 
7.0%
7 61959
 
7.0%
8 59962
 
6.7%
9 59382
 
6.7%

Most occurring scripts

ValueCountFrequency (%)
Common 890676
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 189931
21.3%
2 159764
17.9%
3 101857
11.4%
4 67311
 
7.6%
5 65248
 
7.3%
0 63136
 
7.1%
6 62126
 
7.0%
7 61959
 
7.0%
8 59962
 
6.7%
9 59382
 
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 890676
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 189931
21.3%
2 159764
17.9%
3 101857
11.4%
4 67311
 
7.6%
5 65248
 
7.3%
0 63136
 
7.1%
6 62126
 
7.0%
7 61959
 
7.0%
8 59962
 
6.7%
9 59382
 
6.7%
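
The sample rows of startDayOfYear and endDayOfYear are consistent with both fields being derived from eventDate: 1977-05-21 is day 141 of its year, and the interval 2003-05-01/2003-05-13 spans days 121 to 133. The sketch below shows that derivation with the standard library. It is an illustration under stated assumptions, not the publisher's actual pipeline: `day_of_year_range` is a hypothetical name, and only full `YYYY-MM-DD` dates or `/`-separated intervals of them are handled.

```python
from datetime import date


def day_of_year_range(event_date):
    """Return (startDayOfYear, endDayOfYear) for an eventDate string.

    Accepts 'YYYY-MM-DD' or 'YYYY-MM-DD/YYYY-MM-DD'. Partial dates such as
    a bare year raise ValueError and need separate handling.
    """
    start_text, _, end_text = event_date.partition("/")
    start = date.fromisoformat(start_text)
    end = date.fromisoformat(end_text) if end_text else start
    return start.timetuple().tm_yday, end.timetuple().tm_yday


# The five eventDate values from the Sample block.
for value in ["1977-05-21", "2003-04-05", "2009-12-05",
              "2006-09-14", "2003-05-01/2003-05-13"]:
    print(value, day_of_year_range(value))
```

Because eventDate has a minimum length of 4, some records presumably hold only a year, and those would need to be filtered out or given their own rule before calling a function like this.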

year
Text

Missing 

Distinct: 158
Distinct (%): < 0.1%
Missing: 16370
Missing (%): 4.8%
Memory size: 2.6 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1288280
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: 1977
2nd row: 2003
3rd row: 2009
4th row: 2006
5th row: 2003

Value                 Count   Frequency (%)
2009                  14264   4.4%
2017                  14067   4.4%
2015                  13813   4.3%
2010                  13737   4.3%
2012                  12220   3.8%
2008                  11987   3.7%
2016                  11451   3.6%
2018                  11086   3.4%
2019                  9910    3.1%
2006                  9459    2.9%
Other values (148)    200076  62.1%

Most occurring characters

ValueCountFrequency (%)
0 296276
23.0%
1 274919
21.3%
2 223079
17.3%
9 214556
16.7%
8 70592
 
5.5%
7 60236
 
4.7%
6 50357
 
3.9%
5 38161
 
3.0%
3 30589
 
2.4%
4 29515
 
2.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1288280
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 296276
23.0%
1 274919
21.3%
2 223079
17.3%
9 214556
16.7%
8 70592
 
5.5%
7 60236
 
4.7%
6 50357
 
3.9%
5 38161
 
3.0%
3 30589
 
2.4%
4 29515
 
2.3%

Most occurring scripts

ValueCountFrequency (%)
Common 1288280
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 296276
23.0%
1 274919
21.3%
2 223079
17.3%
9 214556
16.7%
8 70592
 
5.5%
7 60236
 
4.7%
6 50357
 
3.9%
5 38161
 
3.0%
3 30589
 
2.4%
4 29515
 
2.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1288280
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 296276
23.0%
1 274919
21.3%
2 223079
17.3%
9 214556
16.7%
8 70592
 
5.5%
7 60236
 
4.7%
6 50357
 
3.9%
5 38161
 
3.0%
3 30589
 
2.4%
4 29515
 
2.3%

month
Text

Missing 

Distinct: 14
Distinct (%): < 0.1%
Missing: 17966
Missing (%): 5.3%
Memory size: 2.6 MiB

Length

Max length: 11
Median length: 1
Mean length: 1.178235988
Min length: 1

Characters and Unicode

Total characters: 377594
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 5
2nd row: 4
3rd row: 12
4th row: 9
5th row: 5

Value               Count  Frequency (%)
5                   42822  13.4%
6                   37403  11.7%
7                   36987  11.5%
8                   30887  9.6%
4                   28913  9.0%
3                   27542  8.6%
9                   25613  8.0%
10                  23427  7.3%
11                  20374  6.4%
2                   16876  5.3%
Other values (6)    29632  9.2%

Most occurring characters

ValueCountFrequency (%)
1 93804
24.8%
5 42823
11.3%
6 37405
 
9.9%
7 36988
 
9.8%
8 30889
 
8.2%
2 30178
 
8.0%
4 28913
 
7.7%
3 27542
 
7.3%
9 25615
 
6.8%
0 23432
 
6.2%
Other values (3) 5
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 377589
> 99.9%
Other Punctuation 2
 
< 0.1%
Space Separator 2
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 93804
24.8%
5 42823
11.3%
6 37405
 
9.9%
7 36988
 
9.8%
8 30889
 
8.2%
2 30178
 
8.0%
4 28913
 
7.7%
3 27542
 
7.3%
9 25615
 
6.8%
0 23432
 
6.2%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%
Uppercase Letter
ValueCountFrequency (%)
N 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 377593
> 99.9%
Latin 1
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 93804
24.8%
5 42823
11.3%
6 37405
 
9.9%
7 36988
 
9.8%
8 30889
 
8.2%
2 30178
 
8.0%
4 28913
 
7.7%
3 27542
 
7.3%
9 25615
 
6.8%
0 23432
 
6.2%
Other values (2) 4
 
< 0.1%
Latin
ValueCountFrequency (%)
N 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 377594
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 93804
24.8%
5 42823
11.3%
6 37405
 
9.9%
7 36988
 
9.8%
8 30889
 
8.2%
2 30178
 
8.0%
4 28913
 
7.7%
3 27542
 
7.3%
9 25615
 
6.8%
0 23432
 
6.2%
Other values (3) 5
 
< 0.1%

day
Text

Missing 

Distinct: 33
Distinct (%): < 0.1%
Missing: 19384
Missing (%): 5.7%
Memory size: 2.6 MiB

Length

Max length: 12
Median length: 2
Mean length: 1.689853819
Min length: 1

Characters and Unicode

Total characters: 539158
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 21
2nd row: 5
3rd row: 5
4th row: 14
5th row: 1

Value                Count   Frequency (%)
1                    18050   5.7%
15                   11503   3.6%
11                   11303   3.5%
12                   11300   3.5%
5                    11212   3.5%
22                   11187   3.5%
16                   11185   3.5%
10                   11174   3.5%
8                    10958   3.4%
19                   10620   3.3%
Other values (25)    200566  62.9%

Most occurring characters

ValueCountFrequency (%)
1 150942
28.0%
2 130384
24.2%
3 43011
 
8.0%
8 32046
 
5.9%
5 31740
 
5.9%
6 30889
 
5.7%
9 30427
 
5.6%
0 30285
 
5.6%
7 30076
 
5.6%
4 29353
 
5.4%
Other values (3) 5
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 539153
> 99.9%
Other Punctuation 2
 
< 0.1%
Space Separator 2
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 150942
28.0%
2 130384
24.2%
3 43011
 
8.0%
8 32046
 
5.9%
5 31740
 
5.9%
6 30889
 
5.7%
9 30427
 
5.6%
0 30285
 
5.6%
7 30076
 
5.6%
4 29353
 
5.4%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%
Uppercase Letter
ValueCountFrequency (%)
E 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 539157
> 99.9%
Latin 1
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 150942
28.0%
2 130384
24.2%
3 43011
 
8.0%
8 32046
 
5.9%
5 31740
 
5.9%
6 30889
 
5.7%
9 30427
 
5.6%
0 30285
 
5.6%
7 30076
 
5.6%
4 29353
 
5.4%
Other values (2) 4
 
< 0.1%
Latin
ValueCountFrequency (%)
E 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 539158
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 150942
28.0%
2 130384
24.2%
3 43011
 
8.0%
8 32046
 
5.9%
5 31740
 
5.9%
6 30889
 
5.7%
9 30427
 
5.6%
0 30285
 
5.6%
7 30076
 
5.6%
4 29353
 
5.4%
Other values (3) 5
 
< 0.1%
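
The month and day profiles above each show signs of a couple of stray values: month has 14 distinct values with a maximum length of 11, day has 33 distinct values with a maximum length of 12, and each column contains two full stops, two spaces, and one uppercase letter. A simple range check is enough to isolate such rows. The sketch below is illustrative only: `out_of_range` is a hypothetical helper, and the test inputs are made up, since the report does not show the offending values themselves.

```python
def out_of_range(values, low, high):
    """Return the values that are not plain ASCII integers within [low, high]."""
    bad = []
    for value in values:
        text = value.strip()
        # isascii() guards against digit-like characters that int() rejects.
        is_int = text.isascii() and text.isdigit()
        if not (is_int and low <= int(text) <= high):
            bad.append(value)
    return bad


# Made-up inputs for illustration; "1.5 N" is NOT taken from the dataset.
months = ["5", "12", "13", "0", "1.5 N"]
print(out_of_range(months, 1, 12))
```

The same function applies to day with bounds 1 and 31. A stricter check would validate day against the month and year of the same record.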

verbatimEventDate
Text

Missing 

Distinct: 10232
Distinct (%): 10.0%
Missing: 236098
Missing (%): 69.8%
Memory size: 2.6 MiB

Length

Max length: 72
Median length: 71
Mean length: 13.69964433
Min length: 1

Characters and Unicode

Total characters: 1402049
Distinct characters: 76
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2665
Unique (%): 2.6%

Sample

1st row: 4/5/2003 3:59:00 PM
2nd row: 2007 or prior, based on filename of source data sheet
3rd row: 14 Sep 2006
4th row: 10/11/2002 1:30:00 PM
5th row: 11 May 2014

Value                  Count   Frequency (%)
may                    10963   3.6%
apr                    6723    2.2%
pm                     6654    2.2%
aug                    5888    1.9%
(blank)                5378    1.8%
2007                   5232    1.7%
sep                    5187    1.7%
mar                    4910    1.6%
2008                   4661    1.5%
june                   4032    1.3%
Other values (3776)    243090  80.3%

Most occurring characters

ValueCountFrequency (%)
200376
 
14.3%
0 159129
 
11.3%
1 143433
 
10.2%
2 117809
 
8.4%
9 73246
 
5.2%
e 40969
 
2.9%
8 37322
 
2.7%
a 35837
 
2.6%
3 32888
 
2.3%
r 32308
 
2.3%
Other values (66) 528732
37.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 671471
47.9%
Lowercase Letter 314014
22.4%
Space Separator 200376
 
14.3%
Uppercase Letter 112938
 
8.1%
Other Punctuation 64411
 
4.6%
Dash Punctuation 30846
 
2.2%
Open Punctuation 3949
 
0.3%
Close Punctuation 3949
 
0.3%
Math Symbol 81
 
< 0.1%
Connector Punctuation 14
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 40969
13.0%
a 35837
11.4%
r 32308
10.3%
u 27482
8.8%
t 25389
 
8.1%
p 19868
 
6.3%
n 18276
 
5.8%
y 16752
 
5.3%
o 15217
 
4.8%
c 13242
 
4.2%
Other values (15) 68674
21.9%
Uppercase Letter
ValueCountFrequency (%)
M 25479
22.6%
J 20887
18.5%
A 19556
17.3%
S 12309
10.9%
N 7624
 
6.8%
P 6958
 
6.2%
D 5028
 
4.5%
O 4899
 
4.3%
F 4345
 
3.8%
E 1480
 
1.3%
Other values (11) 4373
 
3.9%
Other Punctuation
ValueCountFrequency (%)
: 32233
50.0%
/ 13156
20.4%
. 8673
 
13.5%
; 8159
 
12.7%
, 2132
 
3.3%
? 16
 
< 0.1%
* 15
 
< 0.1%
' 9
 
< 0.1%
& 6
 
< 0.1%
# 6
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 159129
23.7%
1 143433
21.4%
2 117809
17.5%
9 73246
10.9%
8 37322
 
5.6%
3 32888
 
4.9%
5 31315
 
4.7%
7 28026
 
4.2%
4 24321
 
3.6%
6 23982
 
3.6%
Dash Punctuation
ValueCountFrequency (%)
- 30565
99.1%
281
 
0.9%
Open Punctuation
ValueCountFrequency (%)
[ 3931
99.5%
( 18
 
0.5%
Close Punctuation
ValueCountFrequency (%)
] 3931
99.5%
) 18
 
0.5%
Space Separator
ValueCountFrequency (%)
200376
100.0%
Math Symbol
ValueCountFrequency (%)
| 81
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 14
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 975097
69.5%
Latin 426952
30.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 40969
 
9.6%
a 35837
 
8.4%
r 32308
 
7.6%
u 27482
 
6.4%
M 25479
 
6.0%
t 25389
 
5.9%
J 20887
 
4.9%
p 19868
 
4.7%
A 19556
 
4.6%
n 18276
 
4.3%
Other values (36) 160901
37.7%
Common
ValueCountFrequency (%)
200376
20.5%
0 159129
16.3%
1 143433
14.7%
2 117809
12.1%
9 73246
 
7.5%
8 37322
 
3.8%
3 32888
 
3.4%
: 32233
 
3.3%
5 31315
 
3.2%
- 30565
 
3.1%
Other values (20) 116781
12.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1401768
> 99.9%
Punctuation 281
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
200376
 
14.3%
0 159129
 
11.4%
1 143433
 
10.2%
2 117809
 
8.4%
9 73246
 
5.2%
e 40969
 
2.9%
8 37322
 
2.7%
a 35837
 
2.6%
3 32888
 
2.3%
r 32308
 
2.3%
Other values (65) 528451
37.7%
Punctuation
ValueCountFrequency (%)
281
100.0%
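
The verbatimEventDate samples mix at least two machine-readable layouts ("4/5/2003 3:59:00 PM", "14 Sep 2006") with free text ("2007 or prior, based on filename of source data sheet"). One way to triage such a column is to try a short list of formats and leave everything else unparsed. The sketch below assumes that slash-separated dates are month-first, which the report does not confirm, and `CANDIDATE_FORMATS` and `parse_verbatim` are hypothetical names. Month abbreviations are matched using the current locale, so an English or C locale is also assumed.

```python
from datetime import datetime

# Formats inferred from the sampled rows only. Month-first order for the
# slash form is an assumption.
CANDIDATE_FORMATS = (
    "%m/%d/%Y %I:%M:%S %p",  # e.g. 4/5/2003 3:59:00 PM
    "%d %b %Y",              # e.g. 14 Sep 2006
)


def parse_verbatim(text):
    """Return a datetime for recognised layouts, or None for anything else."""
    cleaned = text.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            continue
    return None


for value in ["4/5/2003 3:59:00 PM", "14 Sep 2006",
              "2007 or prior, based on filename of source data sheet"]:
    print(repr(value), "->", parse_verbatim(value))
```

Values that come back as None are the ones worth reviewing by hand, since they carry qualifiers that a fixed format cannot represent.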

habitat
Text

Missing 

Distinct 5075
Distinct (%) 14.1%
Missing 302334
Missing (%) 89.3%
Memory size 2.6 MiB

Length

Max length 382
Median length 180
Mean length 39.97399324
Min length 1

Characters and Unicode

Total characters 1443301
Distinct characters 87
Distinct categories 10
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1916
Unique (%) 5.3%

Sample

1st row Rocky slope with scattered shrubs. Moist soil on slope
2nd row Scrubland
3rd row Ecological remarks by collector(s): yes
4th row Cultivated/garden
5th row brushed from under rubble
ValueCountFrequency (%)
forest 9242
 
4.6%
and 8086
 
4.0%
with 6443
 
3.2%
by 4854
 
2.4%
ecological 4350
 
2.2%
remarks 4350
 
2.2%
collector(s 4345
 
2.1%
in 4302
 
2.1%
yes 3549
 
1.8%
slopes 2423
 
1.2%
Other values (4259) 150202
74.3%
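
The word table above lists tokens such as "collector(s", which suggests that values are lowercased, split on whitespace, and stripped of leading and trailing punctuation before counting. The sketch below reproduces that behaviour on the five sample rows of this variable. It is an inferred approximation, not the profiler's own code; the function name `word_counts` is an assumption made for this example.

```python
import string
from collections import Counter


def word_counts(values):
    """Count lowercased, whitespace-split tokens with edge punctuation removed."""
    strip_chars = string.punctuation + string.whitespace
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            # Only the ends of the token are stripped, so inner punctuation
            # such as "(" in "collector(s" or "/" in "cultivated/garden" survives.
            word = token.strip(strip_chars)
            if word:
                counts[word] += 1
    return counts


# The five sample rows shown for the habitat variable.
rows = [
    "Rocky slope with scattered shrubs. Moist soil on slope",
    "Scrubland",
    "Ecological remarks by collector(s): yes",
    "Cultivated/garden",
    "brushed from under rubble",
]

counts = word_counts(rows)
for word, n in counts.most_common(5):
    print(word, n)
```

Because only the token edges are stripped, a trailing "):" is removed while the inner "(" stays, which matches the truncated-looking "collector(s" entry in the table.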

Most occurring characters

ValueCountFrequency (%)
166040
 
11.5%
e 123018
 
8.5%
a 115153
 
8.0%
r 97879
 
6.8%
o 97117
 
6.7%
s 87783
 
6.1%
i 77223
 
5.4%
n 74016
 
5.1%
t 69226
 
4.8%
l 65379
 
4.5%
Other values (77) 470467
32.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1159779
80.4%
Space Separator 166040
 
11.5%
Uppercase Letter 57713
 
4.0%
Other Punctuation 44027
 
3.1%
Open Punctuation 5088
 
0.4%
Close Punctuation 5084
 
0.4%
Decimal Number 3129
 
0.2%
Dash Punctuation 2289
 
0.2%
Math Symbol 151
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 123018
10.6%
a 115153
 
9.9%
r 97879
 
8.4%
o 97117
 
8.4%
s 87783
 
7.6%
i 77223
 
6.7%
n 74016
 
6.4%
t 69226
 
6.0%
l 65379
 
5.6%
c 53034
 
4.6%
Other values (17) 299951
25.9%
Uppercase Letter
ValueCountFrequency (%)
E 6028
 
10.4%
C 5586
 
9.7%
S 5345
 
9.3%
A 5268
 
9.1%
P 4287
 
7.4%
M 4226
 
7.3%
R 4148
 
7.2%
B 3104
 
5.4%
D 2370
 
4.1%
G 2065
 
3.6%
Other values (16) 15286
26.5%
Other Punctuation
ValueCountFrequency (%)
, 22821
51.8%
. 11654
26.5%
: 4682
 
10.6%
/ 2873
 
6.5%
; 1485
 
3.4%
& 182
 
0.4%
% 121
 
0.3%
" 101
 
0.2%
? 72
 
0.2%
' 24
 
0.1%
Other values (2) 12
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 715
22.9%
1 509
16.3%
2 394
12.6%
5 351
11.2%
3 259
 
8.3%
8 219
 
7.0%
4 197
 
6.3%
6 173
 
5.5%
7 162
 
5.2%
9 150
 
4.8%
Dash Punctuation
ValueCountFrequency (%)
- 2277
99.5%
8
 
0.3%
4
 
0.2%
Math Symbol
ValueCountFrequency (%)
~ 138
91.4%
+ 8
 
5.3%
< 5
 
3.3%
Open Punctuation
ValueCountFrequency (%)
( 5052
99.3%
[ 36
 
0.7%
Close Punctuation
ValueCountFrequency (%)
) 5048
99.3%
] 36
 
0.7%
Space Separator
ValueCountFrequency (%)
166040
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1217492
84.4%
Common 225809
 
15.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 123018
 
10.1%
a 115153
 
9.5%
r 97879
 
8.0%
o 97117
 
8.0%
s 87783
 
7.2%
i 77223
 
6.3%
n 74016
 
6.1%
t 69226
 
5.7%
l 65379
 
5.4%
c 53034
 
4.4%
Other values (43) 357664
29.4%
Common
ValueCountFrequency (%)
166040
73.5%
, 22821
 
10.1%
. 11654
 
5.2%
( 5052
 
2.2%
) 5048
 
2.2%
: 4682
 
2.1%
/ 2873
 
1.3%
- 2277
 
1.0%
; 1485
 
0.7%
0 715
 
0.3%
Other values (24) 3162
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1443283
> 99.9%
Punctuation 12
 
< 0.1%
None 6
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
166040
 
11.5%
e 123018
 
8.5%
a 115153
 
8.0%
r 97879
 
6.8%
o 97117
 
6.7%
s 87783
 
6.1%
i 77223
 
5.4%
n 74016
 
5.1%
t 69226
 
4.8%
l 65379
 
4.5%
Other values (74) 470449
32.6%
Punctuation
ValueCountFrequency (%)
8
66.7%
4
33.3%
None
ValueCountFrequency (%)
ñ 6
100.0%

eventRemarks
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 338439
Missing (%) > 99.9%
Memory size 2.6 MiB
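
The percentages above appear to follow two simple ratios: Missing (%) is taken over all 338440 observations, while Distinct (%) is taken over the non-missing values only. That would explain why a column with a single populated row reports 100.0% distinct. The sketch below reproduces the arithmetic. `column_summary` is a hypothetical helper, not a library function, and the choice of denominator is inferred from the figures in this report.

```python
def column_summary(n_rows, n_missing, n_distinct):
    """Return missing and distinct percentages for one column.

    Assumption: distinct values are measured against non-missing rows only.
    """
    n_present = n_rows - n_missing
    missing_pct = 100.0 * n_missing / n_rows
    distinct_pct = 100.0 * n_distinct / n_present if n_present else float("nan")
    return {"missing_pct": missing_pct, "distinct_pct": distinct_pct}


# eventRemarks: one populated row out of 338440.
event_remarks = column_summary(n_rows=338440, n_missing=338439, n_distinct=1)

# habitat: 5075 distinct values among 338440 - 302334 populated rows.
habitat = column_summary(n_rows=338440, n_missing=302334, n_distinct=5075)

print(event_remarks)
print({key: round(value, 1) for key, value in habitat.items()})
```

Rounded to one decimal place, the habitat figures come out as 89.3 and 14.1, in line with the values reported for that variable.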

Length

Max length 96
Median length 96
Mean length 96
Min length 96

Characters and Unicode

Total characters 96
Distinct characters 33
Distinct categories 7
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Guide to Best Practices for Georeferencing. (Chapman and Wieczorek, eds. 2006). Google Earth Pro
ValueCountFrequency (%)
guide 1
 
7.1%
to 1
 
7.1%
best 1
 
7.1%
practices 1
 
7.1%
for 1
 
7.1%
georeferencing 1
 
7.1%
chapman 1
 
7.1%
and 1
 
7.1%
wieczorek 1
 
7.1%
eds 1
 
7.1%
Other values (4) 4
28.6%

Most occurring characters

ValueCountFrequency (%)
13
 
13.5%
e 11
 
11.5%
o 7
 
7.3%
r 7
 
7.3%
a 5
 
5.2%
n 4
 
4.2%
i 4
 
4.2%
t 4
 
4.2%
c 4
 
4.2%
G 3
 
3.1%
Other values (23) 34
35.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 64
66.7%
Space Separator 13
 
13.5%
Uppercase Letter 9
 
9.4%
Other Punctuation 4
 
4.2%
Decimal Number 4
 
4.2%
Close Punctuation 1
 
1.0%
Open Punctuation 1
 
1.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 11
17.2%
o 7
10.9%
r 7
10.9%
a 5
7.8%
n 4
 
6.2%
i 4
 
6.2%
t 4
 
6.2%
c 4
 
6.2%
d 3
 
4.7%
s 3
 
4.7%
Other values (9) 12
18.8%
Uppercase Letter
ValueCountFrequency (%)
G 3
33.3%
P 2
22.2%
B 1
 
11.1%
W 1
 
11.1%
C 1
 
11.1%
E 1
 
11.1%
Decimal Number
ValueCountFrequency (%)
0 2
50.0%
6 1
25.0%
2 1
25.0%
Other Punctuation
ValueCountFrequency (%)
. 3
75.0%
, 1
 
25.0%
Space Separator
ValueCountFrequency (%)
13
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 73
76.0%
Common 23
 
24.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 11
15.1%
o 7
 
9.6%
r 7
 
9.6%
a 5
 
6.8%
n 4
 
5.5%
i 4
 
5.5%
t 4
 
5.5%
c 4
 
5.5%
G 3
 
4.1%
d 3
 
4.1%
Other values (15) 21
28.8%
Common
ValueCountFrequency (%)
13
56.5%
. 3
 
13.0%
0 2
 
8.7%
, 1
 
4.3%
) 1
 
4.3%
6 1
 
4.3%
2 1
 
4.3%
( 1
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 96
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
13
 
13.5%
e 11
 
11.5%
o 7
 
7.3%
r 7
 
7.3%
a 5
 
5.2%
n 4
 
4.2%
i 4
 
4.2%
t 4
 
4.2%
c 4
 
4.2%
G 3
 
3.1%
Other values (23) 34
35.4%

locationID
Text

Missing 

Distinct 4571
Distinct (%) 8.5%
Missing 284922
Missing (%) 84.2%
Memory size 2.6 MiB

Length

Max length 32
Median length 28
Mean length 6.812642475
Min length 1

Characters and Unicode

Total characters 364599
Distinct characters 83
Distinct categories 12
Distinct scripts 2
Distinct blocks 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1199
Unique (%) 2.2%

Sample

1st row T548
2nd row BIZ-231
3rd row T488
4th row 02-10
5th row VES117
ValueCountFrequency (%)
080611_minv_014 627
 
1.1%
site 469
 
0.8%
i 457
 
0.8%
trawl 456
 
0.8%
serc 326
 
0.6%
14 313
 
0.6%
v1951 309
 
0.5%
080608_minv_012 289
 
0.5%
21 276
 
0.5%
10 275
 
0.5%
Other values (4452) 53080
93.3%

Most occurring characters

ValueCountFrequency (%)
0 37162
 
10.2%
1 34715
 
9.5%
- 19207
 
5.3%
2 18298
 
5.0%
I 15967
 
4.4%
_ 15386
 
4.2%
5 13787
 
3.8%
4 13700
 
3.8%
8 13275
 
3.6%
6 12677
 
3.5%
Other values (73) 170425
46.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 176632
48.4%
Uppercase Letter 123892
34.0%
Lowercase Letter 23399
 
6.4%
Dash Punctuation 19207
 
5.3%
Connector Punctuation 15386
 
4.2%
Space Separator 3359
 
0.9%
Other Punctuation 2274
 
0.6%
Open Punctuation 203
 
0.1%
Close Punctuation 202
 
0.1%
Math Symbol 40
 
< 0.1%
Other values (2) 5
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 15967
12.9%
A 12422
 
10.0%
B 12233
 
9.9%
S 9801
 
7.9%
M 9595
 
7.7%
T 7531
 
6.1%
Z 6274
 
5.1%
O 5823
 
4.7%
N 5746
 
4.6%
V 4404
 
3.6%
Other values (18) 34096
27.5%
Lowercase Letter
ValueCountFrequency (%)
n 2393
10.2%
i 2331
10.0%
e 1964
 
8.4%
m 1947
 
8.3%
o 1871
 
8.0%
a 1852
 
7.9%
t 1743
 
7.4%
r 1556
 
6.6%
v 1293
 
5.5%
g 941
 
4.0%
Other values (17) 5508
23.5%
Decimal Number
ValueCountFrequency (%)
0 37162
21.0%
1 34715
19.7%
2 18298
10.4%
5 13787
 
7.8%
4 13700
 
7.8%
8 13275
 
7.5%
6 12677
 
7.2%
3 12433
 
7.0%
7 11387
 
6.4%
9 9198
 
5.2%
Other Punctuation
ValueCountFrequency (%)
/ 1119
49.2%
. 1044
45.9%
# 109
 
4.8%
, 2
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 196
96.6%
[ 6
 
3.0%
1
 
0.5%
Math Symbol
ValueCountFrequency (%)
> 37
92.5%
¬ 2
 
5.0%
+ 1
 
2.5%
Close Punctuation
ValueCountFrequency (%)
) 196
97.0%
] 6
 
3.0%
Currency Symbol
ValueCountFrequency (%)
¢ 3
75.0%
1
 
25.0%
Dash Punctuation
ValueCountFrequency (%)
- 19207
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 15386
100.0%
Space Separator
ValueCountFrequency (%)
3359
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 217308
59.6%
Latin 147291
40.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 15967
 
10.8%
A 12422
 
8.4%
B 12233
 
8.3%
S 9801
 
6.7%
M 9595
 
6.5%
T 7531
 
5.1%
Z 6274
 
4.3%
O 5823
 
4.0%
N 5746
 
3.9%
V 4404
 
3.0%
Other values (45) 57495
39.0%
Common
ValueCountFrequency (%)
0 37162
17.1%
1 34715
16.0%
- 19207
8.8%
2 18298
8.4%
_ 15386
7.1%
5 13787
 
6.3%
4 13700
 
6.3%
8 13275
 
6.1%
6 12677
 
5.8%
3 12433
 
5.7%
Other values (18) 26668
12.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 364581
> 99.9%
None 15
 
< 0.1%
Punctuation 2
 
< 0.1%
Currency Symbols 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 37162
 
10.2%
1 34715
 
9.5%
- 19207
 
5.3%
2 18298
 
5.0%
I 15967
 
4.4%
_ 15386
 
4.2%
5 13787
 
3.8%
4 13700
 
3.8%
8 13275
 
3.6%
6 12677
 
3.5%
Other values (62) 170407
46.7%
None
ValueCountFrequency (%)
à 3
20.0%
¢ 3
20.0%
 2
13.3%
â 2
13.3%
¬ 2
13.3%
ƒ 1
 
6.7%
š 1
 
6.7%
Å 1
 
6.7%
Currency Symbols
ValueCountFrequency (%)
1
100.0%
Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%

higherGeography
Text

Missing 

Distinct 7780
Distinct (%) 2.3%
Missing 4534
Missing (%) 1.3%
Memory size 2.6 MiB

Length

Max length 128
Median length 103
Mean length 44.48238426
Min length 3

Characters and Unicode

Total characters 14852935
Distinct characters 98
Distinct categories 10
Distinct scripts 2
Distinct blocks 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
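
As a rough illustration of how per-category counts like those in this report could be derived, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own implementation. The function name `category_counts` and the `LONG_NAMES` mapping are assumptions made for this example, and the input is the 5th sample row listed for this variable.

```python
import unicodedata
from collections import Counter

# Two-letter general category codes mapped to the long names used in the
# report. Only the categories that occur in this section are listed.
LONG_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Lm": "Modifier Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Sm": "Math Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(["Asia-Temperate, China, Xizang, Nielamu (Nyalam) Xian"])
for code, n in counts.most_common():
    print(LONG_NAMES.get(code, code), n)

# U+02BB, the mark counted under "Modifier Letter" in this variable.
okina_category = unicodedata.category("\u02bb")
print(okina_category)
```

Summing such counters over every non-missing value of a column would give a table of the same shape as "Most occurring categories".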

Unique

Unique 788
Unique (%) 0.2%

Sample

1st row United States, Arizona, Cochise
2nd row North Pacific Ocean, Gulf of California, Mexico
3rd row South Pacific Ocean, French Polynesia, Society Islands, Moorea
4th row United States, Arkansas
5th row Asia-Temperate, China, Xizang, Nielamu (Nyalam) Xian
ValueCountFrequency (%)
states 150874
 
7.6%
united 150796
 
7.6%
north 101914
 
5.1%
ocean 69469
 
3.5%
pacific 66315
 
3.4%
america 65503
 
3.3%
stated 60374
 
3.0%
not 60374
 
3.0%
islands 44123
 
2.2%
atlantic 41421
 
2.1%
Other values (4526) 1168373
59.0%

Most occurring characters

ValueCountFrequency (%)
1645630
 
11.1%
a 1473689
 
9.9%
t 1109174
 
7.5%
e 1085474
 
7.3%
i 1041776
 
7.0%
n 861843
 
5.8%
, 826533
 
5.6%
o 731933
 
4.9%
r 620649
 
4.2%
s 542408
 
3.7%
Other values (88) 4913826
33.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10257330
69.1%
Uppercase Letter 1954366
 
13.2%
Space Separator 1645630
 
11.1%
Other Punctuation 837556
 
5.6%
Close Punctuation 63081
 
0.4%
Open Punctuation 63081
 
0.4%
Dash Punctuation 30906
 
0.2%
Modifier Letter 813
 
< 0.1%
Decimal Number 169
 
< 0.1%
Math Symbol 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1473689
14.4%
t 1109174
10.8%
e 1085474
10.6%
i 1041776
10.2%
n 861843
8.4%
o 731933
 
7.1%
r 620649
 
6.1%
s 542408
 
5.3%
c 501353
 
4.9%
l 396979
 
3.9%
Other values (36) 1892052
18.4%
Uppercase Letter
ValueCountFrequency (%)
S 344712
17.6%
N 206484
10.6%
A 201040
10.3%
C 181503
9.3%
U 160381
8.2%
P 159171
8.1%
M 93276
 
4.8%
O 87982
 
4.5%
B 72896
 
3.7%
I 68366
 
3.5%
Other values (20) 378555
19.4%
Other Punctuation
ValueCountFrequency (%)
, 826533
98.7%
. 7811
 
0.9%
' 2815
 
0.3%
? 201
 
< 0.1%
/ 190
 
< 0.1%
* 5
 
< 0.1%
; 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
3 108
63.9%
1 24
 
14.2%
2 16
 
9.5%
9 13
 
7.7%
0 8
 
4.7%
Dash Punctuation
ValueCountFrequency (%)
- 30890
99.9%
10
 
< 0.1%
6
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
] 61158
97.0%
) 1923
 
3.0%
Open Punctuation
ValueCountFrequency (%)
[ 61158
97.0%
( 1923
 
3.0%
Space Separator
ValueCountFrequency (%)
1645630
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 813
100.0%
Math Symbol
ValueCountFrequency (%)
= 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12211696
82.2%
Common 2641239
 
17.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1473689
 
12.1%
t 1109174
 
9.1%
e 1085474
 
8.9%
i 1041776
 
8.5%
n 861843
 
7.1%
o 731933
 
6.0%
r 620649
 
5.1%
s 542408
 
4.4%
c 501353
 
4.1%
l 396979
 
3.3%
Other values (66) 3846418
31.5%
Common
ValueCountFrequency (%)
1645630
62.3%
, 826533
31.3%
] 61158
 
2.3%
[ 61158
 
2.3%
- 30890
 
1.2%
. 7811
 
0.3%
' 2815
 
0.1%
) 1923
 
0.1%
( 1923
 
0.1%
ʻ 813
 
< 0.1%
Other values (12) 585
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14839267
99.9%
None 12839
 
0.1%
Modifier Letters 813
 
< 0.1%
Punctuation 16
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1645630
 
11.1%
a 1473689
 
9.9%
t 1109174
 
7.5%
e 1085474
 
7.3%
i 1041776
 
7.0%
n 861843
 
5.8%
, 826533
 
5.6%
o 731933
 
4.9%
r 620649
 
4.2%
s 542408
 
3.7%
Other values (61) 4900158
33.0%
None
ValueCountFrequency (%)
é 3479
27.1%
í 2115
16.5%
ã 1908
14.9%
Î 1381
 
10.8%
ó 1026
 
8.0%
ā 813
 
6.3%
ç 805
 
6.3%
á 431
 
3.4%
ä 239
 
1.9%
ö 194
 
1.5%
Other values (14) 448
 
3.5%
Modifier Letters
ValueCountFrequency (%)
ʻ 813
100.0%
Punctuation
ValueCountFrequency (%)
10
62.5%
6
37.5%

continent
Text

Missing 

Distinct 65
Distinct (%) < 0.1%
Missing 144951
Missing (%) 42.8%
Memory size 2.6 MiB

Length

Max length 50
Median length 46
Mean length 15.26976727
Min length 4

Characters and Unicode

Total characters 2954532
Distinct characters 35
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 5
Unique (%) < 0.1%

Sample

1st row North Pacific Ocean
2nd row South Pacific Ocean
3rd row Asia-Temperate
4th row North Atlantic Ocean
5th row Pacific
ValueCountFrequency (%)
north 93980
21.3%
ocean 69217
15.7%
pacific 66248
15.0%
america 65503
14.8%
atlantic 41364
9.4%
south 31475
 
7.1%
13897
 
3.1%
neotropics 13896
 
3.1%
asia 9238
 
2.1%
africa 8531
 
1.9%
Other values (18) 28368
 
6.4%

Most occurring characters

ValueCountFrequency (%)
c 336991
11.4%
a 291503
 
9.9%
i 287727
 
9.7%
248228
 
8.4%
t 235357
 
8.0%
r 194360
 
6.6%
e 173294
 
5.9%
o 157410
 
5.3%
n 131417
 
4.4%
A 131273
 
4.4%
Other values (25) 766972
26.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2240751
75.8%
Uppercase Letter 433353
 
14.7%
Space Separator 248228
 
8.4%
Dash Punctuation 19474
 
0.7%
Other Punctuation 12726
 
0.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 336991
15.0%
a 291503
13.0%
i 287727
12.8%
t 235357
10.5%
r 194360
8.7%
e 173294
7.7%
o 157410
7.0%
n 131417
 
5.9%
h 125782
 
5.6%
f 74779
 
3.3%
Other values (9) 232131
10.4%
Uppercase Letter
ValueCountFrequency (%)
A 131273
30.3%
N 107876
24.9%
O 72337
16.7%
P 66248
15.3%
S 31802
 
7.3%
I 8809
 
2.0%
T 5570
 
1.3%
C 5068
 
1.2%
W 2692
 
0.6%
L 635
 
0.1%
Other values (2) 1043
 
0.2%
Other Punctuation
ValueCountFrequency (%)
, 12725
> 99.9%
? 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
248228
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 19474
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2674104
90.5%
Common 280428
 
9.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 336991
12.6%
a 291503
10.9%
i 287727
10.8%
t 235357
 
8.8%
r 194360
 
7.3%
e 173294
 
6.5%
o 157410
 
5.9%
n 131417
 
4.9%
A 131273
 
4.9%
h 125782
 
4.7%
Other values (21) 608990
22.8%
Common
ValueCountFrequency (%)
248228
88.5%
- 19474
 
6.9%
, 12725
 
4.5%
? 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2954532
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 336991
11.4%
a 291503
 
9.9%
i 287727
 
9.7%
248228
 
8.4%
t 235357
 
8.0%
r 194360
 
6.6%
e 173294
 
5.9%
o 157410
 
5.3%
n 131417
 
4.4%
A 131273
 
4.4%
Other values (25) 766972
26.0%

waterBody
Text

Missing 

Distinct 218
Distinct (%) 0.2%
Missing 231595
Missing (%) 68.4%
Memory size 2.6 MiB

Length

Max length 101
Median length 55
Mean length 20.41994478
Min length 6

Characters and Unicode

Total characters 2181769
Distinct characters 59
Distinct categories 8
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 18
Unique (%) < 0.1%

Sample

1st row North Pacific Ocean, Gulf of California
2nd row South Pacific Ocean
3rd row North Atlantic Ocean
4th row Pacific
5th row North Pacific Ocean
ValueCountFrequency (%)
ocean 69218
21.1%
pacific 61625
18.8%
north 47128
14.4%
atlantic 41364
12.6%
south 18416
 
5.6%
sea 18259
 
5.6%
caribbean 14747
 
4.5%
bay 12126
 
3.7%
gulf 7272
 
2.2%
of 6755
 
2.1%
Other values (211) 31235
9.5%

Most occurring characters

ValueCountFrequency (%)
a 263749
12.1%
c 237824
10.9%
221300
 
10.1%
i 195326
 
9.0%
t 153144
 
7.0%
n 144281
 
6.6%
e 133436
 
6.1%
o 88007
 
4.0%
f 78821
 
3.6%
h 75935
 
3.5%
Other values (49) 589946
27.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1599886
73.3%
Uppercase Letter 321575
 
14.7%
Space Separator 221300
 
10.1%
Other Punctuation 38190
 
1.8%
Modifier Letter 813
 
< 0.1%
Open Punctuation 2
 
< 0.1%
Close Punctuation 2
 
< 0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 263749
16.5%
c 237824
14.9%
i 195326
12.2%
t 153144
9.6%
n 144281
9.0%
e 133436
8.3%
o 88007
 
5.5%
f 78821
 
4.9%
h 75935
 
4.7%
r 70399
 
4.4%
Other values (16) 158964
9.9%
Uppercase Letter
ValueCountFrequency (%)
O 69719
21.7%
P 64694
20.1%
N 47138
14.7%
A 42035
13.1%
S 38955
12.1%
C 22310
 
6.9%
B 14241
 
4.4%
G 7328
 
2.3%
K 4835
 
1.5%
M 3990
 
1.2%
Other values (14) 6330
 
2.0%
Other Punctuation
ValueCountFrequency (%)
, 38110
99.8%
' 73
 
0.2%
. 6
 
< 0.1%
; 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
221300
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 813
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1921461
88.1%
Common 260308
 
11.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 263749
13.7%
c 237824
12.4%
i 195326
10.2%
t 153144
 
8.0%
n 144281
 
7.5%
e 133436
 
6.9%
o 88007
 
4.6%
f 78821
 
4.1%
h 75935
 
4.0%
r 70399
 
3.7%
Other values (40) 480539
25.0%
Common
ValueCountFrequency (%)
221300
85.0%
, 38110
 
14.6%
ʻ 813
 
0.3%
' 73
 
< 0.1%
. 6
 
< 0.1%
( 2
 
< 0.1%
) 2
 
< 0.1%
; 1
 
< 0.1%
- 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2180143
99.9%
None 813
 
< 0.1%
Modifier Letters 813
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 263749
12.1%
c 237824
10.9%
221300
 
10.2%
i 195326
 
9.0%
t 153144
 
7.0%
n 144281
 
6.6%
e 133436
 
6.1%
o 88007
 
4.0%
f 78821
 
3.6%
h 75935
 
3.5%
Other values (47) 588320
27.0%
None
ValueCountFrequency (%)
ā 813
100.0%
Modifier Letters
ValueCountFrequency (%)
ʻ 813
100.0%

islandGroup
Text

Missing 

Distinct 100
Distinct (%) 0.4%
Missing 315692
Missing (%) 93.3%
Memory size 2.6 MiB

Length

Max length 24
Median length 21
Mean length 14.50536311
Min length 5

Characters and Unicode

Total characters 329968
Distinct characters 51
Distinct categories 4
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 11
Unique (%) < 0.1%

Sample

1st row Society Islands
2nd row Leeward Antilles
3rd row Bahama Islands
4th row Society Islands
5th row Visayas
ValueCountFrequency (%)
islands 15126
31.9%
society 10385
21.9%
leeward 3586
 
7.6%
antilles 3195
 
6.7%
îles 1364
 
2.9%
vent 1364
 
2.9%
du 1303
 
2.7%
cays 1105
 
2.3%
bahama 991
 
2.1%
group 828
 
1.7%
Other values (103) 8219
17.3%

Most occurring characters

ValueCountFrequency (%)
s 39254
11.9%
a 28836
 
8.7%
e 28070
 
8.5%
24718
 
7.5%
l 24500
 
7.4%
n 22729
 
6.9%
d 21963
 
6.7%
i 17641
 
5.3%
t 16063
 
4.9%
I 15514
 
4.7%
Other values (41) 90680
27.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 257173
77.9%
Uppercase Letter 47716
 
14.5%
Space Separator 24718
 
7.5%
Other Punctuation 361
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 39254
15.3%
a 28836
11.2%
e 28070
10.9%
l 24500
9.5%
n 22729
8.8%
d 21963
8.5%
i 17641
6.9%
t 16063
6.2%
o 12693
 
4.9%
y 11946
 
4.6%
Other values (15) 33478
13.0%
Uppercase Letter
ValueCountFrequency (%)
I 15514
32.5%
S 11295
23.7%
L 4429
 
9.3%
A 4273
 
9.0%
V 2164
 
4.5%
B 2064
 
4.3%
C 2048
 
4.3%
Î 1364
 
2.9%
P 926
 
1.9%
G 917
 
1.9%
Other values (13) 2722
 
5.7%
Other Punctuation
ValueCountFrequency (%)
. 353
97.8%
' 8
 
2.2%
Space Separator
ValueCountFrequency (%)
24718
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 304889
92.4%
Common 25079
 
7.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 39254
12.9%
a 28836
 
9.5%
e 28070
 
9.2%
l 24500
 
8.0%
n 22729
 
7.5%
d 21963
 
7.2%
i 17641
 
5.8%
t 16063
 
5.3%
I 15514
 
5.1%
o 12693
 
4.2%
Other values (38) 77626
25.5%
Common
ValueCountFrequency (%)
24718
98.6%
. 353
 
1.4%
' 8
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 328604
99.6%
None 1364
 
0.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 39254
11.9%
a 28836
 
8.8%
e 28070
 
8.5%
24718
 
7.5%
l 24500
 
7.5%
n 22729
 
6.9%
d 21963
 
6.7%
i 17641
 
5.4%
t 16063
 
4.9%
I 15514
 
4.7%
Other values (40) 89316
27.2%
None
ValueCountFrequency (%)
Î 1364
100.0%

island
Text

Missing 

Distinct 566
Distinct (%) 1.0%
Missing 279541
Missing (%) 82.6%
Memory size 2.6 MiB

Length

Max length 31
Median length 25
Mean length 8.431569297
Min length 3

Characters and Unicode

Total characters 496611
Distinct characters 62
Distinct categories 7
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 36
Unique (%) 0.1%

Sample

1st row Moorea
2nd row Moorea
3rd row Mindanao
4th row Klein Curacao
5th row Moorea
ValueCountFrequency (%)
moorea 15960
18.5%
cay 7347
 
8.5%
carrie 4788
 
5.5%
bow 4788
 
5.5%
island 4068
 
4.7%
curacao 3681
 
4.3%
oahu 2250
 
2.6%
luzon 2090
 
2.4%
borneo 2047
 
2.4%
atoll 915
 
1.1%
Other values (560) 38504
44.5%

Most occurring characters

ValueCountFrequency (%)
a 78616
15.8%
o 63718
12.8%
r 44941
 
9.0%
e 38317
 
7.7%
27539
 
5.5%
u 21082
 
4.2%
n 20978
 
4.2%
i 20856
 
4.2%
C 19943
 
4.0%
M 19709
 
4.0%
Other values (52) 140912
28.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 380361
76.6%
Uppercase Letter 86167
 
17.4%
Space Separator 27539
 
5.5%
Close Punctuation 802
 
0.2%
Open Punctuation 802
 
0.2%
Other Punctuation 782
 
0.2%
Dash Punctuation 158
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 78616
20.7%
o 63718
16.8%
r 44941
11.8%
e 38317
10.1%
u 21082
 
5.5%
n 20978
 
5.5%
i 20856
 
5.5%
l 12017
 
3.2%
s 11399
 
3.0%
y 10671
 
2.8%
Other values (19) 57766
15.2%
Uppercase Letter
ValueCountFrequency (%)
C 19943
23.1%
M 19709
22.9%
B 8939
10.4%
I 4682
 
5.4%
T 4468
 
5.2%
S 3549
 
4.1%
L 3261
 
3.8%
H 2578
 
3.0%
O 2571
 
3.0%
P 2470
 
2.9%
Other values (16) 13997
16.2%
Other Punctuation
ValueCountFrequency (%)
' 707
90.4%
. 73
 
9.3%
, 2
 
0.3%
Space Separator
ValueCountFrequency (%)
27539
100.0%
Close Punctuation
ValueCountFrequency (%)
] 802
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 802
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 158
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 466528
93.9%
Common 30083
 
6.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 78616
16.9%
o 63718
13.7%
r 44941
 
9.6%
e 38317
 
8.2%
u 21082
 
4.5%
n 20978
 
4.5%
i 20856
 
4.5%
C 19943
 
4.3%
M 19709
 
4.2%
l 12017
 
2.6%
Other values (45) 126351
27.1%
Common
ValueCountFrequency (%)
27539
91.5%
] 802
 
2.7%
[ 802
 
2.7%
' 707
 
2.4%
- 158
 
0.5%
. 73
 
0.2%
, 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 496161
99.9%
None 450
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 78616
15.8%
o 63718
12.8%
r 44941
 
9.1%
e 38317
 
7.7%
27539
 
5.6%
u 21082
 
4.2%
n 20978
 
4.2%
i 20856
 
4.2%
C 19943
 
4.0%
M 19709
 
4.0%
Other values (47) 140462
28.3%
None
ValueCountFrequency (%)
ç 380
84.4%
ó 34
 
7.6%
ò 19
 
4.2%
Î 14
 
3.1%
Ž 3
 
0.7%

country
Text

Missing 

Distinct 244
Distinct (%) 0.1%
Missing 14430
Missing (%) 4.3%
Memory size 2.6 MiB

Length

Max length 44
Median length 36
Mean length 11.00677448
Min length 4

Characters and Unicode

Total characters 3566305
Distinct characters 66
Distinct categories 7
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 8
Unique (%) < 0.1%

Sample

1st row United States
2nd row Mexico
3rd row French Polynesia
4th row United States
5th row China
ValueCountFrequency (%)
states 150853
28.1%
united 150769
28.1%
french 23145
 
4.3%
polynesia 22963
 
4.3%
mexico 9713
 
1.8%
panama 9216
 
1.7%
belize 9195
 
1.7%
philippines 6781
 
1.3%
guyana 5999
 
1.1%
new 5306
 
1.0%
Other values (266) 142214
26.5%

Most occurring characters

ValueCountFrequency (%)
t 480575
13.5%
e 435115
12.2%
a 385179
10.8%
i 291078
 
8.2%
n 286223
 
8.0%
212144
 
5.9%
s 206350
 
5.8%
d 173347
 
4.9%
S 162516
 
4.6%
U 152569
 
4.3%
Other values (56) 781209
21.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2819834
79.1%
Uppercase Letter 531078
 
14.9%
Space Separator 212144
 
5.9%
Other Punctuation 2066
 
0.1%
Dash Punctuation 595
 
< 0.1%
Close Punctuation 294
 
< 0.1%
Open Punctuation 294
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 480575
17.0%
e 435115
15.4%
a 385179
13.7%
i 291078
10.3%
n 286223
10.2%
s 206350
7.3%
d 173347
 
6.1%
o 77020
 
2.7%
r 74598
 
2.6%
l 73008
 
2.6%
Other values (20) 337341
12.0%
Uppercase Letter
ValueCountFrequency (%)
S 162516
30.6%
U 152569
28.7%
P 49504
 
9.3%
F 26660
 
5.0%
C 22198
 
4.2%
B 21697
 
4.1%
M 21674
 
4.1%
G 14861
 
2.8%
A 8850
 
1.7%
T 8420
 
1.6%
Other values (15) 42129
 
7.9%
Other Punctuation
ValueCountFrequency (%)
, 1670
80.8%
. 280
 
13.6%
' 65
 
3.1%
? 50
 
2.4%
/ 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 240
81.6%
] 54
 
18.4%
Open Punctuation
ValueCountFrequency (%)
( 240
81.6%
[ 54
 
18.4%
Space Separator
ValueCountFrequency (%)
212144
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 595
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3350912
94.0%
Common 215393
 
6.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 480575
14.3%
e 435115
13.0%
a 385179
11.5%
i 291078
8.7%
n 286223
8.5%
s 206350
 
6.2%
d 173347
 
5.2%
S 162516
 
4.8%
U 152569
 
4.6%
o 77020
 
2.3%
Other values (45) 700940
20.9%
Common
ValueCountFrequency (%)
212144
98.5%
, 1670
 
0.8%
- 595
 
0.3%
. 280
 
0.1%
) 240
 
0.1%
( 240
 
0.1%
' 65
 
< 0.1%
[ 54
 
< 0.1%
] 54
 
< 0.1%
? 50
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3563223
99.9%
None 3082
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 480575
13.5%
e 435115
12.2%
a 385179
10.8%
i 291078
 
8.2%
n 286223
 
8.0%
212144
 
6.0%
s 206350
 
5.8%
d 173347
 
4.9%
S 162516
 
4.6%
U 152569
 
4.3%
Other values (52) 778127
21.8%
None
ValueCountFrequency (%)
é 912
29.6%
í 885
28.7%
ã 885
28.7%
ç 400
13.0%

stateProvince
Text

Missing 

Distinct 1646
Distinct (%) 0.6%
Missing 66214
Missing (%) 19.6%
Memory size 2.6 MiB

Length

Max length 52
Median length 42
Mean length 9.616215938
Min length 3

Characters and Unicode

Total characters 2617784
Distinct characters 82
Distinct categories 8
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 68
Unique (%) < 0.1%

Sample

1st row Arizona
2nd row Arkansas
3rd row Xizang
4th row Laikipia
5th row Florida
ValueCountFrequency (%)
california 17069
 
4.6%
florida 16485
 
4.4%
texas 14332
 
3.9%
virginia 13045
 
3.5%
not 10639
 
2.9%
stated 10639
 
2.9%
arizona 9691
 
2.6%
carolina 8854
 
2.4%
region 8372
 
2.3%
new 8072
 
2.2%
Other values (1667) 253746
68.4%

Most occurring characters

ValueCountFrequency (%)
a 361759
13.8%
i 257226
 
9.8%
n 192597
 
7.4%
o 191176
 
7.3%
r 175302
 
6.7%
e 143640
 
5.5%
s 116938
 
4.5%
t 109100
 
4.2%
l 104596
 
4.0%
98718
 
3.8%
Other values (72) 866732
33.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2114743
80.8%
Uppercase Letter 371148
 
14.2%
Space Separator 98718
 
3.8%
Open Punctuation 10915
 
0.4%
Close Punctuation 10915
 
0.4%
Dash Punctuation 8617
 
0.3%
Other Punctuation 2607
 
0.1%
Decimal Number 121
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 361759
17.1%
i 257226
12.2%
n 192597
9.1%
o 191176
9.0%
r 175302
8.3%
e 143640
 
6.8%
s 116938
 
5.5%
t 109100
 
5.2%
l 104596
 
4.9%
u 68939
 
3.3%
Other values (31) 393470
18.6%
Uppercase Letter
ValueCountFrequency (%)
C 47952
12.9%
T 35494
 
9.6%
S 34273
 
9.2%
N 33681
 
9.1%
M 32065
 
8.6%
A 24621
 
6.6%
F 18726
 
5.0%
V 16298
 
4.4%
P 16190
 
4.4%
I 12017
 
3.2%
Other values (18) 99831
26.9%
Other Punctuation
ValueCountFrequency (%)
. 2199
84.3%
' 211
 
8.1%
/ 93
 
3.6%
, 61
 
2.3%
? 43
 
1.6%
Open Punctuation
ValueCountFrequency (%)
[ 10615
97.3%
( 300
 
2.7%
Close Punctuation
ValueCountFrequency (%)
] 10615
97.3%
) 300
 
2.7%
Decimal Number
ValueCountFrequency (%)
3 108
89.3%
9 13
 
10.7%
Space Separator
ValueCountFrequency (%)
98718
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8617
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2485891
95.0%
Common 131893
 
5.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 361759
14.6%
i 257226
 
10.3%
n 192597
 
7.7%
o 191176
 
7.7%
r 175302
 
7.1%
e 143640
 
5.8%
s 116938
 
4.7%
t 109100
 
4.4%
l 104596
 
4.2%
u 68939
 
2.8%
Other values (59) 764618
30.8%
Common
ValueCountFrequency (%)
98718
74.8%
[ 10615
 
8.0%
] 10615
 
8.0%
- 8617
 
6.5%
. 2199
 
1.7%
( 300
 
0.2%
) 300
 
0.2%
' 211
 
0.2%
3 108
 
0.1%
/ 93
 
0.1%
Other values (3) 117
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2611543
99.8%
None 6241
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 361759
13.9%
i 257226
 
9.8%
n 192597
 
7.4%
o 191176
 
7.3%
r 175302
 
6.7%
e 143640
 
5.5%
s 116938
 
4.5%
t 109100
 
4.2%
l 104596
 
4.0%
98718
 
3.8%
Other values (55) 860491
32.9%
None
ValueCountFrequency (%)
é 2427
38.9%
ã 978
15.7%
ó 951
 
15.2%
í 870
 
13.9%
á 390
 
6.2%
ä 239
 
3.8%
ö 185
 
3.0%
ñ 88
 
1.4%
ô 45
 
0.7%
ü 17
 
0.3%
Other values (7) 51
 
0.8%

county
Text

Missing 

Distinct: 3053
Distinct (%): 1.5%
Missing: 140615
Missing (%): 41.5%
Memory size: 2.6 MiB

Length

Max length: 56
Median length: 35
Mean length: 10.83488437
Min length: 1

Characters and Unicode

Total characters: 2143411
Distinct characters: 83
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
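
The category counts reported for each variable can be approximated with Python's standard `unicodedata` module. The sketch below is an illustration rather than the profiler's own implementation: the helper name `category_counts` and the `CATEGORY_NAMES` lookup are assumptions made for this example, and the input is the first three distinct sample values shown for `county`.

```python
import unicodedata
from collections import Counter

# Long names for some two-letter general-category codes that appear in this
# report. This lookup is an assumption for the example, not an exhaustive list.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Nd": "Decimal Number",
    "Sm": "Math Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        if value is None:  # missing cells contribute no characters
            continue
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw two-letter code when no long name is listed
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


sample = ["Cochise", "Nielamu (Nyalam) Xian", "[Not Stated]"]
counts = category_counts(sample)
for name, n in counts.most_common():
    print(f"{name}: {n}")
```

Applied to a whole column, the same loop would yield a table shaped like "Most occurring categories"; scripts and blocks need additional lookup data that the standard library does not provide.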

Unique

Unique: 295
Unique (%): 0.1%

Sample

1st row: Cochise
2nd row: Nielamu (Nyalam) Xian
3rd row: [Not Stated]
4th row: [Not Stated]
5th row: [Not Stated]
ValueCountFrequency (%)
not 49678
 
15.0%
stated 49678
 
15.0%
county 38512
 
11.6%
honolulu 5036
 
1.5%
san 4616
 
1.4%
st 3591
 
1.1%
cochise 3342
 
1.0%
lucie 3228
 
1.0%
island 2684
 
0.8%
xian 2352
 
0.7%
Other values (2542) 169101
51.0%

Most occurring characters

ValueCountFrequency (%)
t 236941
 
11.1%
o 188038
 
8.8%
a 183246
 
8.5%
e 150194
 
7.0%
n 138867
 
6.5%
133993
 
6.3%
u 85315
 
4.0%
i 84625
 
3.9%
d 76684
 
3.6%
r 76363
 
3.6%
Other values (73) 789145
36.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1568693
73.2%
Uppercase Letter 330147
 
15.4%
Space Separator 133993
 
6.3%
Open Punctuation 50909
 
2.4%
Close Punctuation 50909
 
2.4%
Other Punctuation 6653
 
0.3%
Dash Punctuation 2056
 
0.1%
Decimal Number 48
 
< 0.1%
Math Symbol 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 236941
15.1%
o 188038
12.0%
a 183246
11.7%
e 150194
9.6%
n 138867
8.9%
u 85315
 
5.4%
i 84625
 
5.4%
d 76684
 
4.9%
r 76363
 
4.9%
l 60301
 
3.8%
Other values (28) 288119
18.4%
Uppercase Letter
ValueCountFrequency (%)
S 73008
22.1%
C 59889
18.1%
N 54961
16.6%
H 14159
 
4.3%
B 13887
 
4.2%
M 13874
 
4.2%
P 13324
 
4.0%
L 12997
 
3.9%
A 12173
 
3.7%
D 9876
 
3.0%
Other values (18) 51999
15.8%
Other Punctuation
ValueCountFrequency (%)
. 4074
61.2%
' 1745
26.2%
, 626
 
9.4%
? 107
 
1.6%
/ 96
 
1.4%
* 5
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 24
50.0%
2 16
33.3%
0 8
 
16.7%
Open Punctuation
ValueCountFrequency (%)
[ 49687
97.6%
( 1222
 
2.4%
Close Punctuation
ValueCountFrequency (%)
] 49687
97.6%
) 1222
 
2.4%
Dash Punctuation
ValueCountFrequency (%)
- 2046
99.5%
10
 
0.5%
Space Separator
ValueCountFrequency (%)
133993
100.0%
Math Symbol
ValueCountFrequency (%)
= 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1898840
88.6%
Common 244571
 
11.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 236941
 
12.5%
o 188038
 
9.9%
a 183246
 
9.7%
e 150194
 
7.9%
n 138867
 
7.3%
u 85315
 
4.5%
i 84625
 
4.5%
d 76684
 
4.0%
r 76363
 
4.0%
S 73008
 
3.8%
Other values (56) 605559
31.9%
Common
ValueCountFrequency (%)
133993
54.8%
[ 49687
 
20.3%
] 49687
 
20.3%
. 4074
 
1.7%
- 2046
 
0.8%
' 1745
 
0.7%
) 1222
 
0.5%
( 1222
 
0.5%
, 626
 
0.3%
? 107
 
< 0.1%
Other values (7) 162
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2142524
> 99.9%
None 877
 
< 0.1%
Punctuation 10
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 236941
 
11.1%
o 188038
 
8.8%
a 183246
 
8.6%
e 150194
 
7.0%
n 138867
 
6.5%
133993
 
6.3%
u 85315
 
4.0%
i 84625
 
3.9%
d 76684
 
3.6%
r 76363
 
3.6%
Other values (58) 788258
36.8%
None
ValueCountFrequency (%)
í 360
41.0%
ü 153
17.4%
é 137
 
15.6%
ã 45
 
5.1%
á 41
 
4.7%
ó 38
 
4.3%
â 32
 
3.6%
ç 25
 
2.9%
ô 15
 
1.7%
ö 9
 
1.0%
Other values (4) 22
 
2.5%
Punctuation
ValueCountFrequency (%)
10
100.0%

locality
Text

Missing 

Distinct: 31947
Distinct (%): 10.5%
Missing: 34082
Missing (%): 10.1%
Memory size: 2.6 MiB

Length

Max length: 312
Median length: 249
Mean length: 40.81947246
Min length: 3

Characters and Unicode

Total characters: 12423733
Distinct characters: 133
Distinct categories: 17
Distinct scripts: 2
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4476
Unique (%): 1.5%

Sample

1st row: Carr Canyon, Huachuca Mountains
2nd row: Society Islands, Moorea, In front of Hilton
3rd row: Ashdown
4th row: Nielamu Zhen. Route 318 between Zhangmu and Nielamu (Nyalam) ca. 8 km from Zhangmu.
5th row: Mpala Research Centre
ValueCountFrequency (%)
of 95516
 
4.7%
km 27914
 
1.4%
road 26009
 
1.3%
on 20789
 
1.0%
island 19636
 
1.0%
and 19477
 
1.0%
national 18168
 
0.9%
river 17531
 
0.9%
creek 15261
 
0.8%
at 14878
 
0.7%
Other values (27243) 1757365
86.5%

Most occurring characters

ValueCountFrequency (%)
1728186
 
13.9%
a 1103311
 
8.9%
e 889287
 
7.2%
o 819763
 
6.6%
n 662428
 
5.3%
i 647781
 
5.2%
r 607866
 
4.9%
t 592552
 
4.8%
l 449386
 
3.6%
s 433787
 
3.5%
Other values (123) 4489386
36.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8496745
68.4%
Space Separator 1728186
 
13.9%
Uppercase Letter 1412552
 
11.4%
Other Punctuation 436235
 
3.5%
Decimal Number 259007
 
2.1%
Close Punctuation 32228
 
0.3%
Open Punctuation 32214
 
0.3%
Dash Punctuation 20771
 
0.2%
Other Symbol 2900
 
< 0.1%
Math Symbol 1877
 
< 0.1%
Other values (7) 1018
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1103311
13.0%
e 889287
10.5%
o 819763
9.6%
n 662428
 
7.8%
i 647781
 
7.6%
r 607866
 
7.2%
t 592552
 
7.0%
l 449386
 
5.3%
s 433787
 
5.1%
u 304184
 
3.6%
Other values (44) 1986400
23.4%
Uppercase Letter
ValueCountFrequency (%)
S 160294
 
11.3%
C 145463
 
10.3%
M 103816
 
7.3%
B 101782
 
7.2%
R 99908
 
7.1%
P 98660
 
7.0%
N 89546
 
6.3%
I 60545
 
4.3%
A 59405
 
4.2%
L 54638
 
3.9%
Other values (24) 438495
31.0%
Other Punctuation
ValueCountFrequency (%)
, 297955
68.3%
. 103321
 
23.7%
' 10613
 
2.4%
; 7920
 
1.8%
" 4321
 
1.0%
: 4150
 
1.0%
/ 3514
 
0.8%
# 2998
 
0.7%
& 670
 
0.2%
@ 609
 
0.1%
Other values (2) 164
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 50613
19.5%
2 34542
13.3%
0 34120
13.2%
5 31147
12.0%
3 25472
9.8%
4 21094
8.1%
6 17842
 
6.9%
7 16325
 
6.3%
9 14534
 
5.6%
8 13318
 
5.1%
Math Symbol
ValueCountFrequency (%)
= 1277
68.0%
~ 431
 
23.0%
+ 123
 
6.6%
> 35
 
1.9%
< 8
 
0.4%
| 3
 
0.2%
Open Punctuation
ValueCountFrequency (%)
( 26329
81.7%
[ 5884
 
18.3%
1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 26344
81.7%
] 5884
 
18.3%
Dash Punctuation
ValueCountFrequency (%)
- 20763
> 99.9%
8
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
° 2897
99.9%
3
 
0.1%
Space Separator
ValueCountFrequency (%)
1728186
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 813
100.0%
Other Letter
ValueCountFrequency (%)
º 158
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 23
100.0%
Final Punctuation
ValueCountFrequency (%)
10
100.0%
Initial Punctuation
ValueCountFrequency (%)
6
100.0%
Other Number
ValueCountFrequency (%)
¼ 5
100.0%
Currency Symbol
ValueCountFrequency (%)
3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9909455
79.8%
Common 2514278
 
20.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1103311
 
11.1%
e 889287
 
9.0%
o 819763
 
8.3%
n 662428
 
6.7%
i 647781
 
6.5%
r 607866
 
6.1%
t 592552
 
6.0%
l 449386
 
4.5%
s 433787
 
4.4%
u 304184
 
3.1%
Other values (79) 3399110
34.3%
Common
ValueCountFrequency (%)
1728186
68.7%
, 297955
 
11.9%
. 103321
 
4.1%
1 50613
 
2.0%
2 34542
 
1.4%
0 34120
 
1.4%
5 31147
 
1.2%
) 26344
 
1.0%
( 26329
 
1.0%
3 25472
 
1.0%
Other values (34) 156249
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12412648
99.9%
None 10099
 
0.1%
Modifier Letters 813
 
< 0.1%
Latin Ext Additional 142
 
< 0.1%
Punctuation 25
 
< 0.1%
Currency Symbols 3
 
< 0.1%
Letterlike Symbols 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1728186
 
13.9%
a 1103311
 
8.9%
e 889287
 
7.2%
o 819763
 
6.6%
n 662428
 
5.3%
i 647781
 
5.2%
r 607866
 
4.9%
t 592552
 
4.8%
l 449386
 
3.6%
s 433787
 
3.5%
Other values (77) 4478301
36.1%
None
ValueCountFrequency (%)
° 2897
28.7%
è 1904
18.9%
é 1027
 
10.2%
í 1025
 
10.1%
ā 813
 
8.1%
á 677
 
6.7%
ó 376
 
3.7%
ô 224
 
2.2%
ã 207
 
2.0%
ñ 168
 
1.7%
Other values (24) 781
 
7.7%
Modifier Letters
ValueCountFrequency (%)
ʻ 813
100.0%
Latin Ext Additional
ValueCountFrequency (%)
56
39.4%
56
39.4%
10
 
7.0%
10
 
7.0%
10
 
7.0%
Punctuation
ValueCountFrequency (%)
10
40.0%
8
32.0%
6
24.0%
1
 
4.0%
Currency Symbols
ValueCountFrequency (%)
3
100.0%
Letterlike Symbols
ValueCountFrequency (%)
3
100.0%

minimumElevationInMeters
Text

Missing 

Distinct: 2610
Distinct (%): 2.9%
Missing: 249251
Missing (%): 73.6%
Memory size: 2.6 MiB

Length

Max length: 7
Median length: 6
Mean length: 5.173474307
Min length: 3

Characters and Unicode

Total characters: 461417
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 256
Unique (%): 0.3%

Sample

1st row: 1524.0
2nd row: 2700.0
3rd row: 1800.0
4th row: 1000.0
5th row: 760.0
ValueCountFrequency (%)
5.0 1674
 
1.9%
1100.0 1169
 
1.3%
150.0 1080
 
1.2%
200.0 1047
 
1.2%
1200.0 1002
 
1.1%
50.0 851
 
1.0%
10.0 831
 
0.9%
1829.0 752
 
0.8%
100.0 735
 
0.8%
1487.0 633
 
0.7%
Other values (2597) 79415
89.0%

Most occurring characters

ValueCountFrequency (%)
0 142356
30.9%
. 89189
19.3%
1 52311
 
11.3%
2 35256
 
7.6%
5 30778
 
6.7%
3 22639
 
4.9%
4 21024
 
4.6%
7 18460
 
4.0%
6 17046
 
3.7%
8 16799
 
3.6%
Other values (2) 15559
 
3.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 372216
80.7%
Other Punctuation 89189
 
19.3%
Dash Punctuation 12
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 142356
38.2%
1 52311
 
14.1%
2 35256
 
9.5%
5 30778
 
8.3%
3 22639
 
6.1%
4 21024
 
5.6%
7 18460
 
5.0%
6 17046
 
4.6%
8 16799
 
4.5%
9 15547
 
4.2%
Other Punctuation
ValueCountFrequency (%)
. 89189
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 461417
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 142356
30.9%
. 89189
19.3%
1 52311
 
11.3%
2 35256
 
7.6%
5 30778
 
6.7%
3 22639
 
4.9%
4 21024
 
4.6%
7 18460
 
4.0%
6 17046
 
3.7%
8 16799
 
3.6%
Other values (2) 15559
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 461417
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 142356
30.9%
. 89189
19.3%
1 52311
 
11.3%
2 35256
 
7.6%
5 30778
 
6.7%
3 22639
 
4.9%
4 21024
 
4.6%
7 18460
 
4.0%
6 17046
 
3.7%
8 16799
 
3.6%
Other values (2) 15559
 
3.4%

maximumElevationInMeters
Text

Missing 

Distinct: 1577
Distinct (%): 2.9%
Missing: 284628
Missing (%): 84.1%
Memory size: 2.6 MiB

Length

Max length: 7
Median length: 6
Mean length: 5.241916301
Min length: 3

Characters and Unicode

Total characters: 282078
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 152
Unique (%): 0.3%

Sample

1st row: 1524.0
2nd row: 1800.0
3rd row: 1000.0
4th row: 760.0
5th row: 650.0
ValueCountFrequency (%)
1200.0 1121
 
2.1%
1100.0 864
 
1.6%
15.0 845
 
1.6%
200.0 770
 
1.4%
1829.0 742
 
1.4%
50.0 644
 
1.2%
1487.0 633
 
1.2%
800.0 616
 
1.1%
1707.0 575
 
1.1%
1800.0 550
 
1.0%
Other values (1564) 46452
86.3%

Most occurring characters

ValueCountFrequency (%)
0 91330
32.4%
. 53812
19.1%
1 31719
 
11.2%
2 20464
 
7.3%
5 18027
 
6.4%
4 13458
 
4.8%
3 12000
 
4.3%
7 10831
 
3.8%
8 10244
 
3.6%
6 10141
 
3.6%
Other values (2) 10052
 
3.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 228254
80.9%
Other Punctuation 53812
 
19.1%
Dash Punctuation 12
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 91330
40.0%
1 31719
 
13.9%
2 20464
 
9.0%
5 18027
 
7.9%
4 13458
 
5.9%
3 12000
 
5.3%
7 10831
 
4.7%
8 10244
 
4.5%
6 10141
 
4.4%
9 10040
 
4.4%
Other Punctuation
ValueCountFrequency (%)
. 53812
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 282078
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 91330
32.4%
. 53812
19.1%
1 31719
 
11.2%
2 20464
 
7.3%
5 18027
 
6.4%
4 13458
 
4.8%
3 12000
 
4.3%
7 10831
 
3.8%
8 10244
 
3.6%
6 10141
 
3.6%
Other values (2) 10052
 
3.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 282078
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 91330
32.4%
. 53812
19.1%
1 31719
 
11.2%
2 20464
 
7.3%
5 18027
 
6.4%
4 13458
 
4.8%
3 12000
 
4.3%
7 10831
 
3.8%
8 10244
 
3.6%
6 10141
 
3.6%
Other values (2) 10052
 
3.6%

verbatimElevation
Text

Missing 

Distinct: 913
Distinct (%): 5.7%
Missing: 322501
Missing (%): 95.3%
Memory size: 2.6 MiB

Length

Max length: 33
Median length: 27
Mean length: 6.528075789
Min length: 1

Characters and Unicode

Total characters: 104051
Distinct characters: 47
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 164
Unique (%): 1.0%

Sample

1st row: 760 m
2nd row: 1050 ft
3rd row: 611 m
4th row: 73 m
5th row: 500 ft
ValueCountFrequency (%)
m 8076
23.4%
ft 7364
21.3%
ca 906
 
2.6%
503
 
1.5%
50 384
 
1.1%
3440 336
 
1.0%
sea 323
 
0.9%
level 323
 
0.9%
54 314
 
0.9%
80 302
 
0.9%
Other values (758) 15667
45.4%
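
The token counts above show `verbatimElevation` mixing metres (`m`) and feet (`ft`), with qualifiers such as `ca` and the phrase `sea level`. A minimal sketch of how such strings might be converted to metres follows. The function name `elevation_to_meters`, the accepted patterns, and the choice to map "sea level" to 0.0 are all assumptions for illustration; ranges and `<` qualifiers are deliberately left unparsed.

```python
import re

FEET_TO_METERS = 0.3048  # exact, by definition of the international foot

# Hypothetical pattern: optional "ca"/"ca.", one number, then "m" or "ft".
_SIMPLE_ELEVATION = re.compile(
    r"^\s*(?:ca\.?\s*)?(\d+(?:\.\d+)?)\s*(m|ft)\.?\s*$",
    re.IGNORECASE,
)


def elevation_to_meters(text):
    """Return metres for simple 'NUMBER UNIT' strings, otherwise None."""
    if text is None:
        return None
    if text.strip().lower() == "sea level":
        return 0.0  # assumption: treat the phrase as zero elevation
    match = _SIMPLE_ELEVATION.match(text)
    if match is None:
        return None  # ranges, '<' qualifiers, free text: leave for review
    value = float(match.group(1))
    unit = match.group(2).lower()
    return value if unit == "m" else value * FEET_TO_METERS


for raw in ["760 m", "1050 ft", "ca 50 m", "sea level", "100-200 m"]:
    print(repr(raw), "->", elevation_to_meters(raw))
```

Returning `None` for anything outside the simple pattern keeps ambiguous entries visible instead of silently coercing them.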

Most occurring characters

ValueCountFrequency (%)
18559
17.8%
0 15811
15.2%
m 8249
 
7.9%
t 7841
 
7.5%
f 7467
 
7.2%
1 5481
 
5.3%
5 4891
 
4.7%
4 4550
 
4.4%
3 4549
 
4.4%
2 4235
 
4.1%
Other values (37) 22418
21.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50353
48.4%
Lowercase Letter 31713
30.5%
Space Separator 18559
 
17.8%
Other Punctuation 1210
 
1.2%
Dash Punctuation 1026
 
1.0%
Uppercase Letter 596
 
0.6%
Math Symbol 364
 
0.3%
Open Punctuation 115
 
0.1%
Close Punctuation 115
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
m 8249
26.0%
t 7841
24.7%
f 7467
23.5%
a 1699
 
5.4%
e 1622
 
5.1%
c 1235
 
3.9%
l 723
 
2.3%
s 424
 
1.3%
r 422
 
1.3%
v 408
 
1.3%
Other values (12) 1623
 
5.1%
Decimal Number
ValueCountFrequency (%)
0 15811
31.4%
1 5481
 
10.9%
5 4891
 
9.7%
4 4550
 
9.0%
3 4549
 
9.0%
2 4235
 
8.4%
6 3505
 
7.0%
8 3285
 
6.5%
7 2223
 
4.4%
9 1823
 
3.6%
Other Punctuation
ValueCountFrequency (%)
. 1123
92.8%
/ 70
 
5.8%
? 12
 
1.0%
' 4
 
0.3%
, 1
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
S 197
33.1%
P 195
32.7%
G 195
32.7%
L 9
 
1.5%
Math Symbol
ValueCountFrequency (%)
< 294
80.8%
+ 70
 
19.2%
Space Separator
ValueCountFrequency (%)
18559
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1026
100.0%
Open Punctuation
ValueCountFrequency (%)
( 115
100.0%
Close Punctuation
ValueCountFrequency (%)
) 115
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 71742
68.9%
Latin 32309
31.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
m 8249
25.5%
t 7841
24.3%
f 7467
23.1%
a 1699
 
5.3%
e 1622
 
5.0%
c 1235
 
3.8%
l 723
 
2.2%
s 424
 
1.3%
r 422
 
1.3%
v 408
 
1.3%
Other values (16) 2219
 
6.9%
Common
ValueCountFrequency (%)
18559
25.9%
0 15811
22.0%
1 5481
 
7.6%
5 4891
 
6.8%
4 4550
 
6.3%
3 4549
 
6.3%
2 4235
 
5.9%
6 3505
 
4.9%
8 3285
 
4.6%
7 2223
 
3.1%
Other values (11) 4653
 
6.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 104051
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
18559
17.8%
0 15811
15.2%
m 8249
 
7.9%
t 7841
 
7.5%
f 7467
 
7.2%
1 5481
 
5.3%
5 4891
 
4.7%
4 4550
 
4.4%
3 4549
 
4.4%
2 4235
 
4.1%
Other values (37) 22418
21.5%

minimumDepthInMeters
Text

Missing 

Distinct: 2030
Distinct (%): 2.7%
Missing: 264207
Missing (%): 78.1%
Memory size: 2.6 MiB

Length

Max length: 26
Median length: 25
Mean length: 4.129928738
Min length: 3

Characters and Unicode

Total characters: 306577
Distinct characters: 36
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 626
Unique (%): 0.8%

Sample

1st row: 1785.34
2nd row: 13.0
3rd row: 0.0
4th row: 25.0
5th row: 3456.48
ValueCountFrequency (%)
0.0 13845
 
18.6%
1.0 5900
 
7.9%
3.0 3920
 
5.3%
0.5 2538
 
3.4%
2.0 2371
 
3.2%
10.0 2026
 
2.7%
15.0 1882
 
2.5%
1.5 1338
 
1.8%
12.0 1313
 
1.8%
5.0 1309
 
1.8%
Other values (2018) 37795
50.9%

Most occurring characters

ValueCountFrequency (%)
0 91771
29.9%
. 74229
24.2%
1 33738
 
11.0%
2 22613
 
7.4%
5 20042
 
6.5%
3 16696
 
5.4%
6 10912
 
3.6%
7 9741
 
3.2%
8 9557
 
3.1%
9 8666
 
2.8%
Other values (26) 8612
 
2.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 232211
75.7%
Other Punctuation 74229
 
24.2%
Lowercase Letter 75
 
< 0.1%
Dash Punctuation 54
 
< 0.1%
Space Separator 4
 
< 0.1%
Uppercase Letter 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 11
14.7%
o 8
10.7%
r 7
9.3%
e 6
 
8.0%
l 6
 
8.0%
i 6
 
8.0%
s 5
 
6.7%
h 4
 
5.3%
m 4
 
5.3%
t 3
 
4.0%
Other values (9) 15
20.0%
Decimal Number
ValueCountFrequency (%)
0 91771
39.5%
1 33738
 
14.5%
2 22613
 
9.7%
5 20042
 
8.6%
3 16696
 
7.2%
6 10912
 
4.7%
7 9741
 
4.2%
8 9557
 
4.1%
9 8666
 
3.7%
4 8475
 
3.6%
Uppercase Letter
ValueCountFrequency (%)
P 1
25.0%
C 1
25.0%
O 1
25.0%
M 1
25.0%
Other Punctuation
ValueCountFrequency (%)
. 74229
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 54
100.0%
Space Separator
ValueCountFrequency (%)
4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 306498
> 99.9%
Latin 79
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 11
13.9%
o 8
10.1%
r 7
 
8.9%
e 6
 
7.6%
l 6
 
7.6%
i 6
 
7.6%
s 5
 
6.3%
h 4
 
5.1%
m 4
 
5.1%
t 3
 
3.8%
Other values (13) 19
24.1%
Common
ValueCountFrequency (%)
0 91771
29.9%
. 74229
24.2%
1 33738
 
11.0%
2 22613
 
7.4%
5 20042
 
6.5%
3 16696
 
5.4%
6 10912
 
3.6%
7 9741
 
3.2%
8 9557
 
3.1%
9 8666
 
2.8%
Other values (3) 8533
 
2.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 306577
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 91771
29.9%
. 74229
24.2%
1 33738
 
11.0%
2 22613
 
7.4%
5 20042
 
6.5%
3 16696
 
5.4%
6 10912
 
3.6%
7 9741
 
3.2%
8 9557
 
3.1%
9 8666
 
2.8%
Other values (26) 8612
 
2.8%

maximumDepthInMeters
Text

Missing 

Distinct: 1921
Distinct (%): 2.9%
Missing: 271190
Missing (%): 80.1%
Memory size: 2.6 MiB

Length

Max length: 8
Median length: 7
Mean length: 4.119048327
Min length: 3

Characters and Unicode

Total characters: 277006
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 561
Unique (%): 0.8%

Sample

1st row: 1785.34
2nd row: 17.0
3rd row: 98.0
4th row: 35.0
5th row: 3456.48
ValueCountFrequency (%)
1.0 6735
 
10.0%
3.0 5343
 
7.9%
2.0 3528
 
5.2%
5.0 2501
 
3.7%
0.5 1769
 
2.6%
10.0 1694
 
2.5%
1.5 1506
 
2.2%
20.0 1483
 
2.2%
18.0 1428
 
2.1%
12.0 1153
 
1.7%
Other values (1905) 40110
59.6%

Most occurring characters

ValueCountFrequency (%)
0 71150
25.7%
. 67250
24.3%
1 34373
12.4%
2 21363
 
7.7%
5 18352
 
6.6%
3 17780
 
6.4%
8 10244
 
3.7%
6 9269
 
3.3%
9 9185
 
3.3%
7 9177
 
3.3%
Other values (2) 8863
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 209702
75.7%
Other Punctuation 67250
 
24.3%
Dash Punctuation 54
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 71150
33.9%
1 34373
16.4%
2 21363
 
10.2%
5 18352
 
8.8%
3 17780
 
8.5%
8 10244
 
4.9%
6 9269
 
4.4%
9 9185
 
4.4%
7 9177
 
4.4%
4 8809
 
4.2%
Other Punctuation
ValueCountFrequency (%)
. 67250
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 54
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 277006
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 71150
25.7%
. 67250
24.3%
1 34373
12.4%
2 21363
 
7.7%
5 18352
 
6.6%
3 17780
 
6.4%
8 10244
 
3.7%
6 9269
 
3.3%
9 9185
 
3.3%
7 9177
 
3.3%
Other values (2) 8863
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 277006
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 71150
25.7%
. 67250
24.3%
1 34373
12.4%
2 21363
 
7.7%
5 18352
 
6.6%
3 17780
 
6.4%
8 10244
 
3.7%
6 9269
 
3.3%
9 9185
 
3.3%
7 9177
 
3.3%
Other values (2) 8863
 
3.2%

verbatimDepth
Text

Missing 

Distinct: 59
Distinct (%): 4.0%
Missing: 336961
Missing (%): 99.6%
Memory size: 2.6 MiB

Length

Max length: 91
Median length: 10
Mean length: 8.625422583
Min length: 2

Characters and Unicode

Total characters: 12757
Distinct characters: 51
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 29
Unique (%): 2.0%

Sample

1st row: to 1 m
2nd row: intertidal
3rd row: <0.5 m
4th row: intertidal
5th row: intertidal
ValueCountFrequency (%)
intertidal 778
40.5%
m 259
 
13.5%
surface 253
 
13.2%
to 103
 
5.4%
1 95
 
4.9%
0-1 84
 
4.4%
intertida 84
 
4.4%
0.5 68
 
3.5%
1m 47
 
2.4%
cm 13
 
0.7%
Other values (55) 138
 
7.2%

Most occurring characters

ValueCountFrequency (%)
t 1871
14.7%
i 1380
10.8%
e 1167
9.1%
a 1150
9.0%
r 1147
9.0%
n 891
 
7.0%
d 877
 
6.9%
l 806
 
6.3%
443
 
3.5%
I 353
 
2.8%
Other values (41) 2672
20.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10679
83.7%
Uppercase Letter 644
 
5.0%
Decimal Number 596
 
4.7%
Space Separator 443
 
3.5%
Math Symbol 161
 
1.3%
Other Punctuation 108
 
0.8%
Dash Punctuation 102
 
0.8%
Open Punctuation 12
 
0.1%
Close Punctuation 12
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 1871
17.5%
i 1380
12.9%
e 1167
10.9%
a 1150
10.8%
r 1147
10.7%
n 891
8.3%
d 877
8.2%
l 806
7.5%
m 347
 
3.2%
c 260
 
2.4%
Other values (12) 783
7.3%
Decimal Number
ValueCountFrequency (%)
1 252
42.3%
0 198
33.2%
5 82
 
13.8%
2 29
 
4.9%
3 12
 
2.0%
4 5
 
0.8%
6 5
 
0.8%
8 5
 
0.8%
9 4
 
0.7%
7 4
 
0.7%
Uppercase Letter
ValueCountFrequency (%)
I 353
54.8%
S 258
40.1%
M 12
 
1.9%
C 10
 
1.6%
A 5
 
0.8%
U 4
 
0.6%
V 2
 
0.3%
Other Punctuation
ValueCountFrequency (%)
. 81
75.0%
: 14
 
13.0%
" 6
 
5.6%
, 4
 
3.7%
; 3
 
2.8%
Math Symbol
ValueCountFrequency (%)
< 106
65.8%
+ 36
 
22.4%
~ 19
 
11.8%
Space Separator
ValueCountFrequency (%)
443
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 102
100.0%
Open Punctuation
ValueCountFrequency (%)
( 12
100.0%
Close Punctuation
ValueCountFrequency (%)
) 12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11323
88.8%
Common 1434
 
11.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 1871
16.5%
i 1380
12.2%
e 1167
10.3%
a 1150
10.2%
r 1147
10.1%
n 891
7.9%
d 877
7.7%
l 806
7.1%
I 353
 
3.1%
m 347
 
3.1%
Other values (19) 1334
11.8%
Common
ValueCountFrequency (%)
443
30.9%
1 252
17.6%
0 198
13.8%
< 106
 
7.4%
- 102
 
7.1%
5 82
 
5.7%
. 81
 
5.6%
+ 36
 
2.5%
2 29
 
2.0%
~ 19
 
1.3%
Other values (12) 86
 
6.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12757
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 1871
14.7%
i 1380
10.8%
e 1167
9.1%
a 1150
9.0%
r 1147
9.0%
n 891
 
7.0%
d 877
 
6.9%
l 806
 
6.3%
443
 
3.5%
I 353
 
2.8%
Other values (41) 2672
20.9%

locationRemarks
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 338438
Missing (%): > 99.9%
Memory size: 2.6 MiB

Length

Max length: 40
Median length: 30.5
Mean length: 30.5
Min length: 21

Characters and Unicode

Total characters: 61
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: Carpenter, Kent E.; Williams, Jeffrey T.
2nd row: Kirkbride, J. H., Jr.
ValueCountFrequency (%)
carpenter 1
10.0%
kent 1
10.0%
e 1
10.0%
williams 1
10.0%
jeffrey 1
10.0%
t 1
10.0%
kirkbride 1
10.0%
j 1
10.0%
h 1
10.0%
jr 1
10.0%

Most occurring characters

ValueCountFrequency (%)
8
13.1%
r 6
 
9.8%
e 6
 
9.8%
. 5
 
8.2%
i 4
 
6.6%
, 4
 
6.6%
J 3
 
4.9%
l 2
 
3.3%
a 2
 
3.3%
K 2
 
3.3%
Other values (16) 19
31.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 33
54.1%
Other Punctuation 10
 
16.4%
Uppercase Letter 10
 
16.4%
Space Separator 8
 
13.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 6
18.2%
e 6
18.2%
i 4
12.1%
l 2
 
6.1%
a 2
 
6.1%
f 2
 
6.1%
t 2
 
6.1%
n 2
 
6.1%
b 1
 
3.0%
k 1
 
3.0%
Other values (5) 5
15.2%
Uppercase Letter
ValueCountFrequency (%)
J 3
30.0%
K 2
20.0%
T 1
 
10.0%
C 1
 
10.0%
W 1
 
10.0%
E 1
 
10.0%
H 1
 
10.0%
Other Punctuation
ValueCountFrequency (%)
. 5
50.0%
, 4
40.0%
; 1
 
10.0%
Space Separator
ValueCountFrequency (%)
8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 43
70.5%
Common 18
29.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 6
14.0%
e 6
14.0%
i 4
 
9.3%
J 3
 
7.0%
l 2
 
4.7%
a 2
 
4.7%
K 2
 
4.7%
f 2
 
4.7%
t 2
 
4.7%
n 2
 
4.7%
Other values (12) 12
27.9%
Common
ValueCountFrequency (%)
8
44.4%
. 5
27.8%
, 4
22.2%
; 1
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 61
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8
13.1%
r 6
 
9.8%
e 6
 
9.8%
. 5
 
8.2%
i 4
 
6.6%
, 4
 
6.6%
J 3
 
4.9%
l 2
 
3.3%
a 2
 
3.3%
K 2
 
3.3%
Other values (16) 19
31.1%

decimalLatitude
Text

Missing 

Distinct: 22664
Distinct (%): 8.6%
Missing: 73885
Missing (%): 21.8%
Memory size: 2.6 MiB

Length

Max length: 132
Median length: 7
Mean length: 6.755619814
Min length: 3

Characters and Unicode

Total characters: 1787233
Distinct characters: 44
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3759
Unique (%): 1.4%

Sample

1st row: 31.434
2nd row: 27.5772
3rd row: -17.4756
4th row: 28.0392
5th row: 0.293
ValueCountFrequency (%)
12.0832 1368
 
0.5%
16.802 1085
 
0.4%
22.0 898
 
0.3%
31.7306 895
 
0.3%
5.0 792
 
0.3%
17.4726 765
 
0.3%
38.6141 727
 
0.3%
34.9606 682
 
0.3%
17.4825 682
 
0.3%
9.82436 665
 
0.3%
Other values (22436) 256022
96.8%

Most occurring characters

ValueCountFrequency (%)
. 264551
14.8%
3 224973
12.6%
1 184720
10.3%
2 162894
9.1%
7 157735
8.8%
4 151760
8.5%
8 134446
7.5%
5 129696
7.3%
6 125673
7.0%
9 107992
6.0%
Other values (34) 142793
8.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1469719
82.2%
Other Punctuation 264577
 
14.8%
Dash Punctuation 52585
 
2.9%
Lowercase Letter 296
 
< 0.1%
Uppercase Letter 30
 
< 0.1%
Space Separator 26
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 37
12.5%
a 37
12.5%
e 33
11.1%
t 28
9.5%
r 26
8.8%
o 23
 
7.8%
h 14
 
4.7%
n 12
 
4.1%
p 12
 
4.1%
s 12
 
4.1%
Other values (9) 62
20.9%
Uppercase Letter
ValueCountFrequency (%)
A 7
23.3%
C 6
20.0%
O 5
16.7%
N 3
10.0%
V 2
 
6.7%
L 2
 
6.7%
S 1
 
3.3%
I 1
 
3.3%
E 1
 
3.3%
H 1
 
3.3%
Decimal Number
ValueCountFrequency (%)
3 224973
15.3%
1 184720
12.6%
2 162894
11.1%
7 157735
10.7%
4 151760
10.3%
8 134446
9.1%
5 129696
8.8%
6 125673
8.6%
9 107992
7.3%
0 89830
 
6.1%
Other Punctuation
ValueCountFrequency (%)
. 264551
> 99.9%
, 26
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 52585
100.0%
Space Separator
ValueCountFrequency (%)
26
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1786907
> 99.9%
Latin 326
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 37
11.3%
a 37
11.3%
e 33
 
10.1%
t 28
 
8.6%
r 26
 
8.0%
o 23
 
7.1%
h 14
 
4.3%
n 12
 
3.7%
p 12
 
3.7%
s 12
 
3.7%
Other values (20) 92
28.2%
Common
ValueCountFrequency (%)
. 264551
14.8%
3 224973
12.6%
1 184720
10.3%
2 162894
9.1%
7 157735
8.8%
4 151760
8.5%
8 134446
7.5%
5 129696
7.3%
6 125673
7.0%
9 107992
6.0%
Other values (4) 142467
8.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1787233
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 264551
14.8%
3 224973
12.6%
1 184720
10.3%
2 162894
9.1%
7 157735
8.8%
4 151760
8.5%
8 134446
7.5%
5 129696
7.3%
6 125673
7.0%
9 107992
6.0%
Other values (34) 142793
8.0%

decimalLongitude
Text

Missing 

Distinct: 21527
Distinct (%): 8.1%
Missing: 73885
Missing (%): 21.8%
Memory size: 2.6 MiB

Length

Max length: 8
Median length: 8
Mean length: 7.488620514
Min length: 3

Characters and Unicode

Total characters: 1981152
Distinct characters: 18
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3516
Unique (%): 1.3%

Sample

1st row: -110.285
2nd row: -111.45
3rd row: -149.842
4th row: 85.9858
5th row: 36.899
ValueCountFrequency (%)
68.8991 1350
 
0.5%
56.1167 1222
 
0.5%
149.826 1219
 
0.5%
88.082 1101
 
0.4%
149.775 1056
 
0.4%
110.881 913
 
0.3%
88.0817 836
 
0.3%
80.2986 744
 
0.3%
90.2589 732
 
0.3%
176.0 682
 
0.3%
Other values (21343) 254700
96.3%

Most occurring characters

ValueCountFrequency (%)
. 264551
13.4%
1 242225
12.2%
- 216857
10.9%
8 175296
8.8%
7 174521
8.8%
9 158734
8.0%
6 132600
6.7%
4 129719
6.5%
2 127821
6.5%
5 123226
6.2%
Other values (8) 235602
11.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1499712
75.7%
Other Punctuation 264551
 
13.4%
Dash Punctuation 216857
 
10.9%
Lowercase Letter 28
 
< 0.1%
Uppercase Letter 4
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 242225
16.2%
8 175296
11.7%
7 174521
11.6%
9 158734
10.6%
6 132600
8.8%
4 129719
8.6%
2 127821
8.5%
5 123226
8.2%
3 121814
8.1%
0 113756
7.6%
Lowercase Letter
ValueCountFrequency (%)
i 8
28.6%
a 8
28.6%
n 4
14.3%
m 4
14.3%
l 4
14.3%
Other Punctuation
ValueCountFrequency (%)
. 264551
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 216857
100.0%
Uppercase Letter
ValueCountFrequency (%)
A 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1981120
> 99.9%
Latin 32
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
. 264551
13.4%
1 242225
12.2%
- 216857
10.9%
8 175296
8.8%
7 174521
8.8%
9 158734
8.0%
6 132600
6.7%
4 129719
6.5%
2 127821
6.5%
5 123226
6.2%
Other values (2) 235570
11.9%
Latin
ValueCountFrequency (%)
i 8
25.0%
a 8
25.0%
A 4
12.5%
n 4
12.5%
m 4
12.5%
l 4
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1981152
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 264551
13.4%
1 242225
12.2%
- 216857
10.9%
8 175296
8.8%
7 174521
8.8%
9 158734
8.0%
6 132600
6.7%
4 129719
6.5%
2 127821
6.5%
5 123226
6.2%
Other values (8) 235602
11.9%

geodeticDatum
Text

Missing 

Distinct14
Distinct (%)< 0.1%
Missing308301
Missing (%)91.1%
Memory size2.6 MiB

Length

Max length28
Median length18
Mean length13.85878762
Min length5

Characters and Unicode

Total characters417690
Distinct characters41
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st rowWGS 84 (EPSG:4326)
2nd rowWGS84
3rd rowWGS 84 (EPSG:4326)
4th rowWGS 84 (EPSG:4326)
5th rowWGS 84 (EPSG:4326)
ValueCountFrequency (%)
wgs 20207
28.5%
84 20207
28.5%
epsg:4326 19946
28.1%
wgs84 7743
 
10.9%
wgs1984 1331
 
1.9%
not 320
 
0.5%
recorded 320
 
0.5%
nad27 242
 
0.3%
nad83 230
 
0.3%
epsg:4269 145
 
0.2%
Other values (9) 176
 
0.2%

Most occurring characters

ValueCountFrequency (%)
G 49396
11.8%
S 49372
11.8%
4 49372
11.8%
40728
9.8%
8 29587
 
7.1%
W 29281
 
7.0%
2 20357
 
4.9%
3 20176
 
4.8%
( 20091
 
4.8%
E 20091
 
4.8%
Other values (31) 89239
21.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 170341
40.8%
Decimal Number 142780
34.2%
Space Separator 40728
 
9.8%
Open Punctuation 20091
 
4.8%
Other Punctuation 20091
 
4.8%
Close Punctuation 20091
 
4.8%
Lowercase Letter 3568
 
0.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 777
21.8%
o 668
18.7%
d 667
18.7%
r 381
10.7%
t 372
10.4%
c 344
9.6%
a 116
 
3.3%
n 42
 
1.2%
l 38
 
1.1%
k 38
 
1.1%
Other values (6) 125
 
3.5%
Uppercase Letter
ValueCountFrequency (%)
G 49396
29.0%
S 49372
29.0%
W 29281
17.2%
E 20091
11.8%
P 20091
11.8%
N 775
 
0.5%
D 496
 
0.3%
A 473
 
0.3%
R 302
 
0.2%
C 40
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
4 49372
34.6%
8 29587
20.7%
2 20357
14.3%
3 20176
14.1%
6 20091
14.1%
9 1476
 
1.0%
1 1369
 
1.0%
7 242
 
0.2%
0 72
 
0.1%
5 38
 
< 0.1%
Space Separator
ValueCountFrequency (%)
40728
100.0%
Open Punctuation
ValueCountFrequency (%)
( 20091
100.0%
Other Punctuation
ValueCountFrequency (%)
: 20091
100.0%
Close Punctuation
ValueCountFrequency (%)
) 20091
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 243781
58.4%
Latin 173909
41.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
G 49396
28.4%
S 49372
28.4%
W 29281
16.8%
E 20091
11.6%
P 20091
11.6%
e 777
 
0.4%
N 775
 
0.4%
o 668
 
0.4%
d 667
 
0.4%
D 496
 
0.3%
Other values (17) 2295
 
1.3%
Common
ValueCountFrequency (%)
4 49372
20.3%
40728
16.7%
8 29587
12.1%
2 20357
8.4%
3 20176
8.3%
( 20091
8.2%
: 20091
8.2%
6 20091
8.2%
) 20091
8.2%
9 1476
 
0.6%
Other values (4) 1721
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 417690
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
G 49396
11.8%
S 49372
11.8%
4 49372
11.8%
40728
9.8%
8 29587
 
7.1%
W 29281
 
7.0%
2 20357
 
4.9%
3 20176
 
4.8%
( 20091
 
4.8%
E 20091
 
4.8%
Other values (31) 89239
21.4%
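
The geodeticDatum values above spell the same datum in several ways (for example "WGS 84 (EPSG:4326)", "WGS84", and "WGS1984"). The following is a minimal sketch of how such spellings might be collapsed to EPSG codes before analysis. The function name `normalize_datum` and the rule table are assumptions for illustration, and the rules cover only the tokens visible in this report.

```python
import re

# Each rule maps substrings of the cleaned value to an EPSG code.
# EPSG:4326 is WGS 84, EPSG:4269 is NAD83, EPSG:4267 is NAD27.
RULES = [
    (("epsg:4326", "wgs84", "wgs1984"), "EPSG:4326"),
    (("epsg:4269", "nad83"), "EPSG:4269"),
    (("epsg:4267", "nad27"), "EPSG:4267"),
]

def normalize_datum(raw):
    """Return an EPSG code for a free-text datum, or None if unrecognized."""
    if raw is None:
        return None
    # Lowercase and drop everything except letters, digits, and colons,
    # so "WGS 84 (EPSG:4326)" becomes "wgs84epsg:4326".
    key = re.sub(r"[^a-z0-9:]", "", raw.lower())
    for needles, code in RULES:
        if any(needle in key for needle in needles):
            return code
    return None  # e.g. "not recorded" or any spelling not listed above

for value in ["WGS 84 (EPSG:4326)", "WGS84", "WGS1984", "NAD27", "not recorded"]:
    print(value, "->", normalize_datum(value))
```

Returning `None` for unrecognized text keeps ambiguous entries visible for manual review instead of silently assigning them a datum.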
Distinct456
Distinct (%)4.1%
Missing327413
Missing (%)96.7%
Memory size2.6 MiB

Length

Max length14
Median length7
Mean length3.433481455
Min length1

Characters and Unicode

Total characters37861
Distinct characters27
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique42 ?
Unique (%)0.4%

Sample

1st row500
2nd row500
3rd row140000
4th row100
5th row100
ValueCountFrequency (%)
100 1572
 
14.3%
5 436
 
4.0%
14 402
 
3.6%
12 386
 
3.5%
500 366
 
3.3%
10 311
 
2.8%
32 277
 
2.5%
200 273
 
2.5%
15 256
 
2.3%
23 231
 
2.1%
Other values (446) 6517
59.1%

Most occurring characters

ValueCountFrequency (%)
0 8138
21.5%
1 6725
17.8%
2 4908
13.0%
5 3562
9.4%
4 3253
 
8.6%
3 3020
 
8.0%
6 2085
 
5.5%
8 1726
 
4.6%
9 1655
 
4.4%
7 1538
 
4.1%
Other values (17) 1251
 
3.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 36610
96.7%
Other Punctuation 1210
 
3.2%
Lowercase Letter 37
 
0.1%
Uppercase Letter 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 6
16.2%
t 5
13.5%
n 4
10.8%
c 3
8.1%
o 3
8.1%
p 3
8.1%
e 3
8.1%
r 2
 
5.4%
y 2
 
5.4%
g 2
 
5.4%
Other values (3) 4
10.8%
Decimal Number
ValueCountFrequency (%)
0 8138
22.2%
1 6725
18.4%
2 4908
13.4%
5 3562
9.7%
4 3253
 
8.9%
3 3020
 
8.2%
6 2085
 
5.7%
8 1726
 
4.7%
9 1655
 
4.5%
7 1538
 
4.2%
Uppercase Letter
ValueCountFrequency (%)
A 2
50.0%
I 1
25.0%
E 1
25.0%
Other Punctuation
ValueCountFrequency (%)
. 1210
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 37820
99.9%
Latin 41
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 6
14.6%
t 5
12.2%
n 4
9.8%
c 3
 
7.3%
o 3
 
7.3%
p 3
 
7.3%
e 3
 
7.3%
A 2
 
4.9%
r 2
 
4.9%
y 2
 
4.9%
Other values (6) 8
19.5%
Common
ValueCountFrequency (%)
0 8138
21.5%
1 6725
17.8%
2 4908
13.0%
5 3562
9.4%
4 3253
 
8.6%
3 3020
 
8.0%
6 2085
 
5.5%
8 1726
 
4.6%
9 1655
 
4.4%
7 1538
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 37861
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 8138
21.5%
1 6725
17.8%
2 4908
13.0%
5 3562
9.4%
4 3253
 
8.6%
3 3020
 
8.0%
6 2085
 
5.5%
8 1726
 
4.6%
9 1655
 
4.4%
7 1538
 
4.1%
Other values (17) 1251
 
3.3%

coordinatePrecision
Text

Missing 

Distinct4
Distinct (%)100.0%
Missing338436
Missing (%)> 99.9%
Memory size2.6 MiB

Length

Max length13
Median length12.5
Mean length12.25
Min length11

Characters and Unicode

Total characters49
Distinct characters20
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique4 ?
Unique (%)100.0%

Sample

1st rowCharaciformes
2nd rowHoplonemertea
3rd rowSiluriformes
4th rowLepidoptera
ValueCountFrequency (%)
characiformes 1
25.0%
hoplonemertea 1
25.0%
siluriformes 1
25.0%
lepidoptera 1
25.0%

Most occurring characters

ValueCountFrequency (%)
e 7
14.3%
r 6
12.2%
o 5
10.2%
a 4
 
8.2%
i 4
 
8.2%
m 3
 
6.1%
p 3
 
6.1%
s 2
 
4.1%
t 2
 
4.1%
f 2
 
4.1%
Other values (10) 11
22.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 45
91.8%
Uppercase Letter 4
 
8.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 7
15.6%
r 6
13.3%
o 5
11.1%
a 4
8.9%
i 4
8.9%
m 3
6.7%
p 3
6.7%
s 2
 
4.4%
t 2
 
4.4%
f 2
 
4.4%
Other values (6) 7
15.6%
Uppercase Letter
ValueCountFrequency (%)
L 1
25.0%
S 1
25.0%
C 1
25.0%
H 1
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 49
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 7
14.3%
r 6
12.2%
o 5
10.2%
a 4
 
8.2%
i 4
 
8.2%
m 3
 
6.1%
p 3
 
6.1%
s 2
 
4.1%
t 2
 
4.1%
f 2
 
4.1%
Other values (10) 11
22.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 49
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 7
14.3%
r 6
12.2%
o 5
10.2%
a 4
 
8.2%
i 4
 
8.2%
m 3
 
6.1%
p 3
 
6.1%
s 2
 
4.1%
t 2
 
4.1%
f 2
 
4.1%
Other values (10) 11
22.4%

verbatimCoordinates
Text

Missing 

Distinct4
Distinct (%)100.0%
Missing338436
Missing (%)> 99.9%
Memory size2.6 MiB

Length

Max length19
Median length12.5
Mean length13.5
Min length10

Characters and Unicode

Total characters54
Distinct characters17
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique4 ?
Unique (%)100.0%

Sample

1st rowCharacidae
2nd rowOtotyphlonemertidae
3rd rowCallichthyidae
4th rowLimacodidae
ValueCountFrequency (%)
characidae 1
25.0%
ototyphlonemertidae 1
25.0%
callichthyidae 1
25.0%
limacodidae 1
25.0%

Most occurring characters

ValueCountFrequency (%)
a 8
14.8%
i 6
11.1%
e 6
11.1%
d 5
9.3%
h 4
 
7.4%
t 4
 
7.4%
c 3
 
5.6%
o 3
 
5.6%
l 3
 
5.6%
C 2
 
3.7%
Other values (7) 10
18.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 50
92.6%
Uppercase Letter 4
 
7.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8
16.0%
i 6
12.0%
e 6
12.0%
d 5
10.0%
h 4
8.0%
t 4
8.0%
c 3
 
6.0%
o 3
 
6.0%
l 3
 
6.0%
r 2
 
4.0%
Other values (4) 6
12.0%
Uppercase Letter
ValueCountFrequency (%)
C 2
50.0%
O 1
25.0%
L 1
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 54
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8
14.8%
i 6
11.1%
e 6
11.1%
d 5
9.3%
h 4
 
7.4%
t 4
 
7.4%
c 3
 
5.6%
o 3
 
5.6%
l 3
 
5.6%
C 2
 
3.7%
Other values (7) 10
18.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 54
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 8
14.8%
i 6
11.1%
e 6
11.1%
d 5
9.3%
h 4
 
7.4%
t 4
 
7.4%
c 3
 
5.6%
o 3
 
5.6%
l 3
 
5.6%
C 2
 
3.7%
Other values (7) 10
18.5%

verbatimLatitude
Text

Missing 

Distinct7618
Distinct (%)7.0%
Missing230082
Missing (%)68.0%
Memory size2.6 MiB

Length

Max length38
Median length29
Mean length9.909577512
Min length1

Characters and Unicode

Total characters1073782
Distinct characters40
Distinct categories11 ?
Distinct scripts2 ?
Distinct blocks4 ?

Unique

Unique1979 ?
Unique (%)1.8%

Sample

1st row27.57721471
2nd row-17.47564
3rd row27 25.347 N
4th row17 28 57.5 S
5th row36.22739648
ValueCountFrequency (%)
n 36304
 
14.9%
s 13596
 
5.6%
17 3921
 
1.6%
12 3357
 
1.4%
27 3322
 
1.4%
36 3186
 
1.3%
3135
 
1.3%
35 3063
 
1.3%
16 3017
 
1.2%
38 2723
 
1.1%
Other values (6455) 168722
69.1%

Most occurring characters

ValueCountFrequency (%)
135988
12.7%
1 101490
9.5%
3 91003
8.5%
2 86585
 
8.1%
. 84857
 
7.9%
4 78858
 
7.3%
7 73653
 
6.9%
0 73576
 
6.9%
5 70329
 
6.5%
8 59987
 
5.6%
Other values (30) 217456
20.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 748860
69.7%
Space Separator 135988
 
12.7%
Other Punctuation 97236
 
9.1%
Uppercase Letter 58993
 
5.5%
Dash Punctuation 29085
 
2.7%
Other Symbol 3208
 
0.3%
Lowercase Letter 387
 
< 0.1%
Modifier Letter 22
 
< 0.1%
Modifier Symbol 1
 
< 0.1%
Other Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 101490
13.6%
3 91003
12.2%
2 86585
11.6%
4 78858
10.5%
7 73653
9.8%
0 73576
9.8%
5 70329
9.4%
8 59987
8.0%
9 56962
7.6%
6 56417
7.5%
Lowercase Letter
ValueCountFrequency (%)
e 122
31.5%
d 116
30.0%
g 116
30.0%
a 13
 
3.4%
r 5
 
1.3%
n 5
 
1.3%
t 5
 
1.3%
c 3
 
0.8%
s 2
 
0.5%
Other Punctuation
ValueCountFrequency (%)
. 84857
87.3%
; 4725
 
4.9%
' 4356
 
4.5%
" 3260
 
3.4%
: 26
 
< 0.1%
, 6
 
< 0.1%
? 4
 
< 0.1%
/ 2
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
N 43888
74.4%
S 15037
 
25.5%
W 58
 
0.1%
M 5
 
< 0.1%
L 5
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 29059
99.9%
26
 
0.1%
Space Separator
ValueCountFrequency (%)
135988
100.0%
Other Symbol
ValueCountFrequency (%)
° 3208
100.0%
Modifier Letter
ValueCountFrequency (%)
ʹ 22
100.0%
Modifier Symbol
ValueCountFrequency (%)
˚ 1
100.0%
Other Letter
ValueCountFrequency (%)
º 1
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1014401
94.5%
Latin 59381
 
5.5%

Most frequent character per script

Common
ValueCountFrequency (%)
135988
13.4%
1 101490
10.0%
3 91003
9.0%
2 86585
8.5%
. 84857
8.4%
4 78858
7.8%
7 73653
7.3%
0 73576
7.3%
5 70329
6.9%
8 59987
 
5.9%
Other values (15) 158075
15.6%
Latin
ValueCountFrequency (%)
N 43888
73.9%
S 15037
 
25.3%
e 122
 
0.2%
d 116
 
0.2%
g 116
 
0.2%
W 58
 
0.1%
a 13
 
< 0.1%
r 5
 
< 0.1%
n 5
 
< 0.1%
M 5
 
< 0.1%
Other values (5) 16
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1070523
99.7%
None 3209
 
0.3%
Punctuation 27
 
< 0.1%
Modifier Letters 23
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
135988
12.7%
1 101490
9.5%
3 91003
8.5%
2 86585
8.1%
. 84857
 
7.9%
4 78858
 
7.4%
7 73653
 
6.9%
0 73576
 
6.9%
5 70329
 
6.6%
8 59987
 
5.6%
Other values (24) 214197
20.0%
None
ValueCountFrequency (%)
° 3208
> 99.9%
º 1
 
< 0.1%
Punctuation
ValueCountFrequency (%)
26
96.3%
1
 
3.7%
Modifier Letters
ValueCountFrequency (%)
ʹ 22
95.7%
˚ 1
 
4.3%
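
The verbatimLatitude samples mix signed decimal degrees ("-17.47564") with space-separated degrees and minutes ("27 25.347 N") and degrees, minutes, and seconds ("17 28 57.5 S"). Below is a hedged sketch of converting those three shapes to signed decimal degrees. The function name `verbatim_to_decimal` is hypothetical, and values that use symbols such as °, ' or " (also present in this column) are deliberately left unparsed.

```python
import re

# Plain signed decimal number, e.g. "-17.47564".
DECIMAL = re.compile(r"[+-]?\d+(?:\.\d+)?")

# Degrees, optional minutes, optional seconds, then a hemisphere letter,
# e.g. "27 25.347 N" or "17 28 57.5 S".
DMS = re.compile(
    r"(\d+(?:\.\d+)?)(?:\s+(\d+(?:\.\d+)?))?(?:\s+(\d+(?:\.\d+)?))?\s*([NSEW])",
    re.IGNORECASE,
)

def verbatim_to_decimal(text):
    """Return signed decimal degrees, or None if the shape is not recognized."""
    s = text.strip()
    if DECIMAL.fullmatch(s):
        return float(s)
    m = DMS.fullmatch(s)
    if m is None:
        return None
    deg, minutes, seconds, hemi = m.groups()
    value = float(deg) + float(minutes or 0) / 60 + float(seconds or 0) / 3600
    # Southern and western hemispheres are negative by convention.
    return -value if hemi.upper() in "SW" else value

for sample in ["27.57721471", "-17.47564", "27 25.347 N", "17 28 57.5 S"]:
    print(sample, "->", verbatim_to_decimal(sample))
```

The same function accepts the "W" and "E" suffixes used in verbatimLongitude, since only the hemisphere letter determines the sign.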

verbatimLongitude
Text

Missing 

Distinct7632
Distinct (%)7.0%
Missing230109
Missing (%)68.0%
Memory size2.6 MiB

Length

Max length255
Median length32
Mean length10.73394504
Min length2

Characters and Unicode

Total characters1162819
Distinct characters41
Distinct categories11 ?
Distinct scripts2 ?
Distinct blocks4 ?

Unique

Unique1960 ?
Unique (%)1.8%

Sample

1st row-111.4495292
2nd row-149.84247
3rd row79 56.156 W
4th row149 53 59.6 W
5th row-122.879564
ValueCountFrequency (%)
w 38487
 
15.8%
e 11389
 
4.7%
3089
 
1.3%
149 2722
 
1.1%
53 2149
 
0.9%
68 1810
 
0.7%
075 1719
 
0.7%
79 1634
 
0.7%
55 1499
 
0.6%
77 1463
 
0.6%
Other values (6616) 178159
73.0%

Most occurring characters

ValueCountFrequency (%)
135789
11.7%
1 112912
9.7%
0 89717
 
7.7%
8 85734
 
7.4%
. 84821
 
7.3%
5 83305
 
7.2%
7 79392
 
6.8%
2 77607
 
6.7%
4 76922
 
6.6%
9 73881
 
6.4%
Other values (31) 262739
22.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 809633
69.6%
Space Separator 135789
 
11.7%
Other Punctuation 97302
 
8.4%
Uppercase Letter 59020
 
5.1%
Dash Punctuation 57456
 
4.9%
Other Symbol 3208
 
0.3%
Lowercase Letter 385
 
< 0.1%
Modifier Letter 22
 
< 0.1%
Final Punctuation 2
 
< 0.1%
Modifier Symbol 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 112912
13.9%
0 89717
11.1%
8 85734
10.6%
5 83305
10.3%
7 79392
9.8%
2 77607
9.6%
4 76922
9.5%
9 73881
9.1%
3 71552
8.8%
6 58611
7.2%
Lowercase Letter
ValueCountFrequency (%)
e 122
31.7%
g 116
30.1%
d 116
30.1%
n 10
 
2.6%
a 7
 
1.8%
r 5
 
1.3%
o 5
 
1.3%
c 2
 
0.5%
s 2
 
0.5%
Other Punctuation
ValueCountFrequency (%)
. 84821
87.2%
; 4611
 
4.7%
' 4322
 
4.4%
" 3255
 
3.3%
# 255
 
0.3%
: 26
 
< 0.1%
, 8
 
< 0.1%
? 4
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
W 47002
79.6%
E 11954
 
20.3%
S 49
 
0.1%
N 5
 
< 0.1%
M 5
 
< 0.1%
L 5
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 57430
> 99.9%
26
 
< 0.1%
Space Separator
ValueCountFrequency (%)
135789
100.0%
Other Symbol
ValueCountFrequency (%)
° 3208
100.0%
Modifier Letter
ValueCountFrequency (%)
ʹ 22
100.0%
Final Punctuation
ValueCountFrequency (%)
2
100.0%
Modifier Symbol
ValueCountFrequency (%)
˚ 1
100.0%
Other Letter
ValueCountFrequency (%)
º 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1103413
94.9%
Latin 59406
 
5.1%

Most frequent character per script

Common
ValueCountFrequency (%)
135789
12.3%
1 112912
10.2%
0 89717
8.1%
8 85734
7.8%
. 84821
7.7%
5 83305
7.5%
7 79392
 
7.2%
2 77607
 
7.0%
4 76922
 
7.0%
9 73881
 
6.7%
Other values (15) 203333
18.4%
Latin
ValueCountFrequency (%)
W 47002
79.1%
E 11954
 
20.1%
e 122
 
0.2%
g 116
 
0.2%
d 116
 
0.2%
S 49
 
0.1%
n 10
 
< 0.1%
a 7
 
< 0.1%
N 5
 
< 0.1%
r 5
 
< 0.1%
Other values (6) 20
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1159559
99.7%
None 3209
 
0.3%
Punctuation 28
 
< 0.1%
Modifier Letters 23
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
135789
11.7%
1 112912
9.7%
0 89717
 
7.7%
8 85734
 
7.4%
. 84821
 
7.3%
5 83305
 
7.2%
7 79392
 
6.8%
2 77607
 
6.7%
4 76922
 
6.6%
9 73881
 
6.4%
Other values (25) 259479
22.4%
None
ValueCountFrequency (%)
° 3208
> 99.9%
º 1
 
< 0.1%
Punctuation
ValueCountFrequency (%)
26
92.9%
2
 
7.1%
Modifier Letters
ValueCountFrequency (%)
ʹ 22
95.7%
˚ 1
 
4.3%
Distinct6
Distinct (%)0.1%
Missing329369
Missing (%)97.3%
Memory size2.6 MiB

Length

Max length23
Median length23
Mean length22.74655496
Min length3

Characters and Unicode

Total characters206334
Distinct characters25
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowDegrees Minutes Seconds
2nd rowDegrees Minutes Seconds
3rd rowDegrees Minutes Seconds
4th rowDegrees Minutes Seconds
5th rowDegrees Minutes Seconds
ValueCountFrequency (%)
degrees 8924
33.1%
minutes 8849
32.8%
seconds 8849
32.8%
township 107
 
0.4%
range 107
 
0.4%
decimal 75
 
0.3%
utm 24
 
0.1%
unknown 16
 
0.1%

Most occurring characters

ValueCountFrequency (%)
e 44652
21.6%
s 26729
13.0%
n 17960
 
8.7%
17880
 
8.7%
g 9031
 
4.4%
i 9031
 
4.4%
D 8988
 
4.4%
o 8972
 
4.3%
c 8924
 
4.3%
r 8924
 
4.3%
Other values (15) 45243
21.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 161466
78.3%
Uppercase Letter 26988
 
13.1%
Space Separator 17880
 
8.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 44652
27.7%
s 26729
16.6%
n 17960
11.1%
g 9031
 
5.6%
i 9031
 
5.6%
o 8972
 
5.6%
c 8924
 
5.5%
r 8924
 
5.5%
d 8860
 
5.5%
u 8849
 
5.5%
Other values (8) 9534
 
5.9%
Uppercase Letter
ValueCountFrequency (%)
D 8988
33.3%
M 8873
32.9%
S 8849
32.8%
T 131
 
0.5%
R 107
 
0.4%
U 40
 
0.1%
Space Separator
ValueCountFrequency (%)
17880
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 188454
91.3%
Common 17880
 
8.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 44652
23.7%
s 26729
14.2%
n 17960
9.5%
g 9031
 
4.8%
i 9031
 
4.8%
D 8988
 
4.8%
o 8972
 
4.8%
c 8924
 
4.7%
r 8924
 
4.7%
M 8873
 
4.7%
Other values (14) 36370
19.3%
Common
ValueCountFrequency (%)
17880
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 206334
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 44652
21.6%
s 26729
13.0%
n 17960
 
8.7%
17880
 
8.7%
g 9031
 
4.4%
i 9031
 
4.4%
D 8988
 
4.4%
o 8972
 
4.3%
c 8924
 
4.3%
r 8924
 
4.3%
Other values (15) 45243
21.9%

verbatimSRS
Text

Missing 

Distinct4
Distinct (%)100.0%
Missing338436
Missing (%)> 99.9%
Memory size2.6 MiB

Length

Max length17
Median length10
Mean length10.75
Min length6

Characters and Unicode

Total characters43
Distinct characters20
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique4 ?
Unique (%)100.0%

Sample

1st rowMoenkhausia
2nd rowOtotyphlonemertes
3rd rowCorydoras
4th rowParasa
ValueCountFrequency (%)
moenkhausia 1
25.0%
ototyphlonemertes 1
25.0%
corydoras 1
25.0%
parasa 1
25.0%

Most occurring characters

ValueCountFrequency (%)
a 6
14.0%
o 5
11.6%
e 4
 
9.3%
s 4
 
9.3%
r 4
 
9.3%
t 3
 
7.0%
n 2
 
4.7%
h 2
 
4.7%
y 2
 
4.7%
M 1
 
2.3%
Other values (10) 10
23.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 39
90.7%
Uppercase Letter 4
 
9.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6
15.4%
o 5
12.8%
e 4
10.3%
s 4
10.3%
r 4
10.3%
t 3
7.7%
n 2
 
5.1%
h 2
 
5.1%
y 2
 
5.1%
l 1
 
2.6%
Other values (6) 6
15.4%
Uppercase Letter
ValueCountFrequency (%)
M 1
25.0%
C 1
25.0%
O 1
25.0%
P 1
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 43
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 6
14.0%
o 5
11.6%
e 4
 
9.3%
s 4
 
9.3%
r 4
 
9.3%
t 3
 
7.0%
n 2
 
4.7%
h 2
 
4.7%
y 2
 
4.7%
M 1
 
2.3%
Other values (10) 10
23.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 43
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 6
14.0%
o 5
11.6%
e 4
 
9.3%
s 4
 
9.3%
r 4
 
9.3%
t 3
 
7.0%
n 2
 
4.7%
h 2
 
4.7%
y 2
 
4.7%
M 1
 
2.3%
Other values (10) 10
23.3%

footprintSpatialFit
Text

Missing 

Distinct4
Distinct (%)100.0%
Missing338436
Missing (%)> 99.9%
Memory size2.6 MiB

Length

Max length23
Median length20
Mean length19.75
Min length16

Characters and Unicode

Total characters79
Distinct characters23
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique4 ?
Unique (%)100.0%

Sample

1st rowChampsodon nudivittis
2nd rowCoccocypselum guianense
3rd rowEmoia caeruleocauda
4th rowDimorphandra sp.
ValueCountFrequency (%)
champsodon 1
12.5%
nudivittis 1
12.5%
coccocypselum 1
12.5%
guianense 1
12.5%
emoia 1
12.5%
caeruleocauda 1
12.5%
dimorphandra 1
12.5%
sp 1
12.5%

Most occurring characters

ValueCountFrequency (%)
a 8
 
10.1%
o 7
 
8.9%
i 6
 
7.6%
c 5
 
6.3%
s 5
 
6.3%
n 5
 
6.3%
u 5
 
6.3%
e 5
 
6.3%
m 4
 
5.1%
p 4
 
5.1%
Other values (13) 25
31.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 70
88.6%
Space Separator 4
 
5.1%
Uppercase Letter 4
 
5.1%
Other Punctuation 1
 
1.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8
11.4%
o 7
10.0%
i 6
 
8.6%
c 5
 
7.1%
s 5
 
7.1%
n 5
 
7.1%
u 5
 
7.1%
e 5
 
7.1%
m 4
 
5.7%
p 4
 
5.7%
Other values (8) 16
22.9%
Uppercase Letter
ValueCountFrequency (%)
C 2
50.0%
E 1
25.0%
D 1
25.0%
Space Separator
ValueCountFrequency (%)
4
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 74
93.7%
Common 5
 
6.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8
 
10.8%
o 7
 
9.5%
i 6
 
8.1%
c 5
 
6.8%
s 5
 
6.8%
n 5
 
6.8%
u 5
 
6.8%
e 5
 
6.8%
m 4
 
5.4%
p 4
 
5.4%
Other values (11) 20
27.0%
Common
ValueCountFrequency (%)
4
80.0%
. 1
 
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 79
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 8
 
10.1%
o 7
 
8.9%
i 6
 
7.6%
c 5
 
6.3%
s 5
 
6.3%
n 5
 
6.3%
u 5
 
6.3%
e 5
 
6.3%
m 4
 
5.1%
p 4
 
5.1%
Other values (13) 25
31.6%

georeferencedBy
Text

Missing 

Distinct4
Distinct (%)100.0%
Missing338436
Missing (%)> 99.9%
Memory size2.6 MiB

Length

Max length14
Median length11
Mean length9
Min length7

Characters and Unicode

Total characters36
Distinct characters15
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique4 ?
Unique (%)100.0%

Sample

1st rowhemigrammoides
2nd rowpallida
3rd rowbicolor
4th rowhilarula
ValueCountFrequency (%)
hemigrammoides 1
25.0%
pallida 1
25.0%
bicolor 1
25.0%
hilarula 1
25.0%

Most occurring characters

ValueCountFrequency (%)
i 5
13.9%
a 5
13.9%
l 5
13.9%
m 3
8.3%
r 3
8.3%
o 3
8.3%
h 2
 
5.6%
e 2
 
5.6%
d 2
 
5.6%
g 1
 
2.8%
Other values (5) 5
13.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 36
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 5
13.9%
a 5
13.9%
l 5
13.9%
m 3
8.3%
r 3
8.3%
o 3
8.3%
h 2
 
5.6%
e 2
 
5.6%
d 2
 
5.6%
g 1
 
2.8%
Other values (5) 5
13.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 36
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 5
13.9%
a 5
13.9%
l 5
13.9%
m 3
8.3%
r 3
8.3%
o 3
8.3%
h 2
 
5.6%
e 2
 
5.6%
d 2
 
5.6%
g 1
 
2.8%
Other values (5) 5
13.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 36
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 5
13.9%
a 5
13.9%
l 5
13.9%
m 3
8.3%
r 3
8.3%
o 3
8.3%
h 2
 
5.6%
e 2
 
5.6%
d 2
 
5.6%
g 1
 
2.8%
Other values (5) 5
13.9%

georeferenceProtocol
Text

Missing 

Distinct172
Distinct (%)0.2%
Missing255527
Missing (%)75.5%
Memory size2.6 MiB

Length

Max length228
Median length12
Mean length16.00867174
Min length3

Characters and Unicode

Total characters1327327
Distinct characters69
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique10 ?
Unique (%)< 0.1%

Sample

1st rowGoogle Earth
2nd rowGoogle Earth
3rd rowGoogle Earth
4th rowGeoLocate
5th rowGoogle Earth
ValueCountFrequency (%)
google 50746
24.4%
earth 44746
21.5%
gps 24192
 
11.6%
maps 6426
 
3.1%
georeferencing 4999
 
2.4%
and 3624
 
1.7%
pro 3253
 
1.6%
for 3180
 
1.5%
to 3180
 
1.5%
best 3179
 
1.5%
Other values (336) 60537
29.1%

Most occurring characters

ValueCountFrequency (%)
o 140045
 
10.6%
125149
 
9.4%
e 116930
 
8.8%
r 91937
 
6.9%
G 90373
 
6.8%
a 86896
 
6.5%
t 73045
 
5.5%
g 60345
 
4.5%
l 58359
 
4.4%
h 52505
 
4.0%
Other values (59) 431743
32.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 893231
67.3%
Uppercase Letter 249335
 
18.8%
Space Separator 125149
 
9.4%
Other Punctuation 25688
 
1.9%
Decimal Number 24656
 
1.9%
Close Punctuation 4365
 
0.3%
Open Punctuation 4365
 
0.3%
Dash Punctuation 538
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 140045
15.7%
e 116930
13.1%
r 91937
10.3%
a 86896
9.7%
t 73045
8.2%
g 60345
6.8%
l 58359
6.5%
h 52505
 
5.9%
i 37111
 
4.2%
n 36414
 
4.1%
Other values (15) 139644
15.6%
Uppercase Letter
ValueCountFrequency (%)
G 90373
36.2%
E 47266
19.0%
P 31313
 
12.6%
S 30537
 
12.2%
M 8773
 
3.5%
N 6265
 
2.5%
C 5723
 
2.3%
I 4595
 
1.8%
B 3747
 
1.5%
W 3527
 
1.4%
Other values (13) 17216
 
6.9%
Decimal Number
ValueCountFrequency (%)
0 9145
37.1%
2 5815
23.6%
6 3905
15.8%
1 2229
 
9.0%
7 1433
 
5.8%
9 918
 
3.7%
4 562
 
2.3%
5 509
 
2.1%
3 84
 
0.3%
8 56
 
0.2%
Other Punctuation
ValueCountFrequency (%)
. 11882
46.3%
, 6287
24.5%
/ 5944
23.1%
: 1370
 
5.3%
& 153
 
0.6%
! 40
 
0.2%
; 12
 
< 0.1%
Space Separator
ValueCountFrequency (%)
125149
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4365
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4365
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 538
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1142566
86.1%
Common 184761
 
13.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 140045
12.3%
e 116930
 
10.2%
r 91937
 
8.0%
G 90373
 
7.9%
a 86896
 
7.6%
t 73045
 
6.4%
g 60345
 
5.3%
l 58359
 
5.1%
h 52505
 
4.6%
E 47266
 
4.1%
Other values (38) 324865
28.4%
Common
ValueCountFrequency (%)
125149
67.7%
. 11882
 
6.4%
0 9145
 
4.9%
, 6287
 
3.4%
/ 5944
 
3.2%
2 5815
 
3.1%
) 4365
 
2.4%
( 4365
 
2.4%
6 3905
 
2.1%
1 2229
 
1.2%
Other values (11) 5675
 
3.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1327327
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 140045
 
10.6%
125149
 
9.4%
e 116930
 
8.8%
r 91937
 
6.9%
G 90373
 
6.8%
a 86896
 
6.5%
t 73045
 
5.5%
g 60345
 
4.5%
l 58359
 
4.4%
h 52505
 
4.0%
Other values (59) 431743
32.5%

georeferenceRemarks
Text

Missing 

Distinct224
Distinct (%)2.4%
Missing328933
Missing (%)97.2%
Memory size2.6 MiB

Length

Max length83
Median length51
Mean length18.5300305
Min length2

Characters and Unicode

Total characters176165
Distinct characters63
Distinct categories9 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique19 ?
Unique (%)0.2%

Sample

1st rowMax error (m): 100
2nd rowMax error (m): 40
3rd rowLocality extent = 1.6
4th rowLocality extent = 1 mile
5th rowMax error (m): 200
ValueCountFrequency (%)
m 5357
14.5%
max 4970
13.5%
error 4970
13.5%
1992
 
5.4%
locality 1821
 
4.9%
extent 1820
 
4.9%
100 1766
 
4.8%
50 916
 
2.5%
200 740
 
2.0%
4 668
 
1.8%
Other values (241) 11826
32.1%

Most occurring characters

ValueCountFrequency (%)
27339
15.5%
r 16685
 
9.5%
e 10655
 
6.0%
o 10235
 
5.8%
a 10121
 
5.7%
t 9615
 
5.5%
0 8344
 
4.7%
x 7007
 
4.0%
m 6541
 
3.7%
n 5380
 
3.1%
Other values (53) 64243
36.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 95057
54.0%
Space Separator 27339
 
15.5%
Decimal Number 20326
 
11.5%
Uppercase Letter 12887
 
7.3%
Other Punctuation 8468
 
4.8%
Open Punctuation 4973
 
2.8%
Close Punctuation 4973
 
2.8%
Math Symbol 1820
 
1.0%
Dash Punctuation 322
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 16685
17.6%
e 10655
11.2%
o 10235
10.8%
a 10121
10.6%
t 9615
10.1%
x 7007
7.4%
m 6541
 
6.9%
n 5380
 
5.7%
i 4180
 
4.4%
l 2812
 
3.0%
Other values (13) 11826
12.4%
Uppercase Letter
ValueCountFrequency (%)
M 4970
38.6%
L 1896
 
14.7%
S 1114
 
8.6%
E 1103
 
8.6%
W 999
 
7.8%
G 775
 
6.0%
C 408
 
3.2%
H 372
 
2.9%
V 253
 
2.0%
R 238
 
1.8%
Other values (9) 759
 
5.9%
Decimal Number
ValueCountFrequency (%)
0 8344
41.1%
1 3656
18.0%
5 2481
 
12.2%
2 1520
 
7.5%
4 1399
 
6.9%
8 936
 
4.6%
6 713
 
3.5%
3 496
 
2.4%
7 441
 
2.2%
9 340
 
1.7%
Other Punctuation
ValueCountFrequency (%)
: 4970
58.7%
. 1939
 
22.9%
; 1214
 
14.3%
, 311
 
3.7%
/ 31
 
0.4%
' 3
 
< 0.1%
Space Separator
ValueCountFrequency (%)
27339
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4973
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4973
100.0%
Math Symbol
ValueCountFrequency (%)
= 1820
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 322
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 107944
61.3%
Common 68221
38.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 16685
15.5%
e 10655
9.9%
o 10235
9.5%
a 10121
9.4%
t 9615
8.9%
x 7007
 
6.5%
m 6541
 
6.1%
n 5380
 
5.0%
M 4970
 
4.6%
i 4180
 
3.9%
Other values (32) 22555
20.9%
Common
ValueCountFrequency (%)
27339
40.1%
0 8344
 
12.2%
( 4973
 
7.3%
) 4973
 
7.3%
: 4970
 
7.3%
1 3656
 
5.4%
5 2481
 
3.6%
. 1939
 
2.8%
= 1820
 
2.7%
2 1520
 
2.2%
Other values (11) 6206
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 176165
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
27339
15.5%
r 16685
 
9.5%
e 10655
 
6.0%
o 10235
 
5.8%
a 10121
 
5.7%
t 9615
 
5.5%
0 8344
 
4.7%
x 7007
 
4.0%
m 6541
 
3.7%
n 5380
 
3.1%
Other values (53) 64243
36.5%

geologicalContextID
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing338439
Missing (%)> 99.9%
Memory size2.6 MiB

Length

Max length12
Median length12
Mean length12
Min length12

Characters and Unicode

Total characters12
Distinct characters10
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
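
As a rough illustration of how per-category counts like the ones in this section could be derived, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module for the sample value `(Keferstein)`. The helper name `category_counts` and the partial `CATEGORY_NAMES` map are assumptions made for this example; the profiling tool may compute its figures differently.

```python
import unicodedata
from collections import Counter

# Partial map from general-category codes to the long names used in this report.
# Hypothetical helper table for this example only.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Sm": "Math Symbol",
}

def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)  # e.g. "Ll" for a lowercase letter
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

counts = category_counts(["(Keferstein)"])
for name, n in counts.most_common():
    print(name, n)
```

For this single value the tally gives 9 lowercase letters and one each of uppercase letter, open punctuation, and close punctuation, 12 characters in total.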

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st row(Keferstein)
Value       Count  Frequency (%)
keferstein  1      100.0%

Most occurring characters

ValueCountFrequency (%)
e 3
25.0%
( 1
 
8.3%
K 1
 
8.3%
f 1
 
8.3%
r 1
 
8.3%
s 1
 
8.3%
t 1
 
8.3%
i 1
 
8.3%
n 1
 
8.3%
) 1
 
8.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9
75.0%
Open Punctuation 1
 
8.3%
Uppercase Letter 1
 
8.3%
Close Punctuation 1
 
8.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3
33.3%
f 1
 
11.1%
r 1
 
11.1%
s 1
 
11.1%
t 1
 
11.1%
i 1
 
11.1%
n 1
 
11.1%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Uppercase Letter
ValueCountFrequency (%)
K 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10
83.3%
Common 2
 
16.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3
30.0%
K 1
 
10.0%
f 1
 
10.0%
r 1
 
10.0%
s 1
 
10.0%
t 1
 
10.0%
i 1
 
10.0%
n 1
 
10.0%
Common
ValueCountFrequency (%)
( 1
50.0%
) 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 3
25.0%
( 1
 
8.3%
K 1
 
8.3%
f 1
 
8.3%
r 1
 
8.3%
s 1
 
8.3%
t 1
 
8.3%
i 1
 
8.3%
n 1
 
8.3%
) 1
 
8.3%
Distinct4
Distinct (%)100.0%
Missing338436
Missing (%)> 99.9%
Memory size2.6 MiB

Length

Max length134
Median length71
Mean length83.5
Min length58

Characters and Unicode

Total characters334
Distinct characters35
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4 ?
Unique (%)100.0%

Sample

1st rowAnimalia, Chordata, Vertebrata, Osteichthyes, Actinopterygii, Neopterygii, Acanthopterygii, Perciformes, Trachinoidei, Champsodontidae
2nd rowPlantae, Dicotyledonae, Gentianales, Rubiaceae, Rubioideae
3rd rowAnimalia, Chordata, Vertebrata, Reptilia, Squamata, Sauria, Scincidae, Eugongylinae
4th rowPlantae, Dicotyledonae, Fabales, Fabaceae, Caesalpinioideae
Value              Count  Frequency (%)
animalia           2      7.1%
plantae            2      7.1%
chordata           2      7.1%
dicotyledonae      2      7.1%
vertebrata         2      7.1%
actinopterygii     1      3.6%
rubioideae         1      3.6%
fabaceae           1      3.6%
fabales            1      3.6%
eugongylinae       1      3.6%
Other values (13)  13     46.4%

Most occurring characters

ValueCountFrequency (%)
a 43
12.9%
e 35
 
10.5%
i 32
 
9.6%
24
 
7.2%
, 24
 
7.2%
t 21
 
6.3%
n 16
 
4.8%
o 16
 
4.8%
r 13
 
3.9%
l 11
 
3.3%
Other values (25) 99
29.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 258
77.2%
Uppercase Letter 28
 
8.4%
Space Separator 24
 
7.2%
Other Punctuation 24
 
7.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 43
16.7%
e 35
13.6%
i 32
12.4%
t 21
8.1%
n 16
 
6.2%
o 16
 
6.2%
r 13
 
5.0%
l 11
 
4.3%
c 11
 
4.3%
d 10
 
3.9%
Other values (10) 50
19.4%
Uppercase Letter
ValueCountFrequency (%)
A 4
14.3%
C 4
14.3%
S 3
10.7%
P 3
10.7%
R 3
10.7%
D 2
7.1%
F 2
7.1%
V 2
7.1%
T 1
 
3.6%
G 1
 
3.6%
Other values (3) 3
10.7%
Space Separator
ValueCountFrequency (%)
24
100.0%
Other Punctuation
ValueCountFrequency (%)
, 24
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 286
85.6%
Common 48
 
14.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 43
15.0%
e 35
12.2%
i 32
11.2%
t 21
 
7.3%
n 16
 
5.6%
o 16
 
5.6%
r 13
 
4.5%
l 11
 
3.8%
c 11
 
3.8%
d 10
 
3.5%
Other values (23) 78
27.3%
Common
ValueCountFrequency (%)
24
50.0%
, 24
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 334
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 43
12.9%
e 35
 
10.5%
i 32
 
9.6%
24
 
7.2%
, 24
 
7.2%
t 21
 
6.3%
n 16
 
4.8%
o 16
 
4.8%
r 13
 
3.9%
l 11
 
3.3%
Other values (25) 99
29.6%
Distinct2
Distinct (%)50.0%
Missing338436
Missing (%)> 99.9%
Memory size2.6 MiB

Length

Max length8
Median length7.5
Mean length7.5
Min length7

Characters and Unicode

Total characters30
Distinct characters9
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowAnimalia
2nd rowPlantae
3rd rowAnimalia
4th rowPlantae
Value     Count  Frequency (%)
animalia  2      50.0%
plantae   2      50.0%

Most occurring characters

ValueCountFrequency (%)
a 8
26.7%
n 4
13.3%
i 4
13.3%
l 4
13.3%
A 2
 
6.7%
m 2
 
6.7%
P 2
 
6.7%
t 2
 
6.7%
e 2
 
6.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 26
86.7%
Uppercase Letter 4
 
13.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8
30.8%
n 4
15.4%
i 4
15.4%
l 4
15.4%
m 2
 
7.7%
t 2
 
7.7%
e 2
 
7.7%
Uppercase Letter
ValueCountFrequency (%)
A 2
50.0%
P 2
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 30
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8
26.7%
n 4
13.3%
i 4
13.3%
l 4
13.3%
A 2
 
6.7%
m 2
 
6.7%
P 2
 
6.7%
t 2
 
6.7%
e 2
 
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 8
26.7%
n 4
13.3%
i 4
13.3%
l 4
13.3%
A 2
 
6.7%
m 2
 
6.7%
P 2
 
6.7%
t 2
 
6.7%
e 2
 
6.7%

earliestEraOrLowestErathem
Text

Constant  Missing 

Distinct1
Distinct (%)50.0%
Missing338438
Missing (%)> 99.9%
Memory size2.6 MiB
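
The percentages in this summary can be reproduced by hand. The sketch below assumes that Distinct (%) and Unique (%) are taken relative to the non-missing cells and that Missing (%) is taken relative to all 338440 observations; that reading matches the figures shown here, but it is an interpretation and not a statement about the tool's internals. The function name `column_summary` is hypothetical.

```python
from collections import Counter

def column_summary(values):
    """Summarise a column in which None marks a missing cell."""
    present = [v for v in values if v is not None]
    freq = Counter(present)
    n_total = len(values)
    n_present = len(present)
    n_missing = n_total - n_present
    n_distinct = len(freq)
    # "Unique" here means values that occur exactly once.
    n_unique = sum(1 for c in freq.values() if c == 1)
    return {
        "distinct": n_distinct,
        "distinct_pct": 100.0 * n_distinct / n_present if n_present else 0.0,
        "unique": n_unique,
        "unique_pct": 100.0 * n_unique / n_present if n_present else 0.0,
        "missing": n_missing,
        "missing_pct": 100.0 * n_missing / n_total if n_total else 0.0,
    }

# Two populated cells ("Chordata" twice) out of 338440 observations.
column = ["Chordata", "Chordata"] + [None] * 338438
summary = column_summary(column)
print(summary)
```

With two identical populated cells this gives 1 distinct value (50.0% of the non-missing cells), 0 unique values, and a missing share just under 100%, which the report displays as "> 99.9%".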

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters16
Distinct characters7
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowChordata
2nd rowChordata
Value     Count  Frequency (%)
chordata  2      100.0%

Most occurring characters

ValueCountFrequency (%)
a 4
25.0%
C 2
12.5%
h 2
12.5%
o 2
12.5%
r 2
12.5%
d 2
12.5%
t 2
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 14
87.5%
Uppercase Letter 2
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4
28.6%
h 2
14.3%
o 2
14.3%
r 2
14.3%
d 2
14.3%
t 2
14.3%
Uppercase Letter
ValueCountFrequency (%)
C 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4
25.0%
C 2
12.5%
h 2
12.5%
o 2
12.5%
r 2
12.5%
d 2
12.5%
t 2
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4
25.0%
C 2
12.5%
h 2
12.5%
o 2
12.5%
r 2
12.5%
d 2
12.5%
t 2
12.5%
Distinct3
Distinct (%)75.0%
Missing338436
Missing (%)> 99.9%
Memory size2.6 MiB

Length

Max length14
Median length13.5
Mean length12
Min length8

Characters and Unicode

Total characters48
Distinct characters16
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)50.0%

Sample

1st rowActinopterygii
2nd rowDicotyledonae
3rd rowReptilia
4th rowDicotyledonae
Value           Count  Frequency (%)
dicotyledonae   2      50.0%
actinopterygii  1      25.0%
reptilia        1      25.0%

Most occurring characters

ValueCountFrequency (%)
i 7
14.6%
e 6
12.5%
o 5
10.4%
t 5
10.4%
c 3
 
6.2%
y 3
 
6.2%
l 3
 
6.2%
n 3
 
6.2%
a 3
 
6.2%
D 2
 
4.2%
Other values (6) 8
16.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 44
91.7%
Uppercase Letter 4
 
8.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 7
15.9%
e 6
13.6%
o 5
11.4%
t 5
11.4%
c 3
6.8%
y 3
6.8%
l 3
6.8%
n 3
6.8%
a 3
6.8%
d 2
 
4.5%
Other values (3) 4
9.1%
Uppercase Letter
ValueCountFrequency (%)
D 2
50.0%
A 1
25.0%
R 1
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 48
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 7
14.6%
e 6
12.5%
o 5
10.4%
t 5
10.4%
c 3
 
6.2%
y 3
 
6.2%
l 3
 
6.2%
n 3
 
6.2%
a 3
 
6.2%
D 2
 
4.2%
Other values (6) 8
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 48
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 7
14.6%
e 6
12.5%
o 5
10.4%
t 5
10.4%
c 3
 
6.2%
y 3
 
6.2%
l 3
 
6.2%
n 3
 
6.2%
a 3
 
6.2%
D 2
 
4.2%
Other values (6) 8
16.7%
Distinct4
Distinct (%)100.0%
Missing338436
Missing (%)> 99.9%
Memory size2.6 MiB

Length

Max length11
Median length9.5
Mean length9.25
Min length7

Characters and Unicode

Total characters37
Distinct characters19
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4 ?
Unique (%)100.0%

Sample

1st rowPerciformes
2nd rowGentianales
3rd rowSquamata
4th rowFabales
Value        Count  Frequency (%)
perciformes  1      25.0%
gentianales  1      25.0%
squamata     1      25.0%
fabales      1      25.0%

Most occurring characters

ValueCountFrequency (%)
a 7
18.9%
e 5
13.5%
s 3
 
8.1%
r 2
 
5.4%
i 2
 
5.4%
m 2
 
5.4%
n 2
 
5.4%
t 2
 
5.4%
l 2
 
5.4%
P 1
 
2.7%
Other values (9) 9
24.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 33
89.2%
Uppercase Letter 4
 
10.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 7
21.2%
e 5
15.2%
s 3
9.1%
r 2
 
6.1%
i 2
 
6.1%
m 2
 
6.1%
n 2
 
6.1%
t 2
 
6.1%
l 2
 
6.1%
u 1
 
3.0%
Other values (5) 5
15.2%
Uppercase Letter
ValueCountFrequency (%)
P 1
25.0%
S 1
25.0%
F 1
25.0%
G 1
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 37
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 7
18.9%
e 5
13.5%
s 3
 
8.1%
r 2
 
5.4%
i 2
 
5.4%
m 2
 
5.4%
n 2
 
5.4%
t 2
 
5.4%
l 2
 
5.4%
P 1
 
2.7%
Other values (9) 9
24.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 37
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 7
18.9%
e 5
13.5%
s 3
 
8.1%
r 2
 
5.4%
i 2
 
5.4%
m 2
 
5.4%
n 2
 
5.4%
t 2
 
5.4%
l 2
 
5.4%
P 1
 
2.7%
Other values (9) 9
24.3%
Distinct4
Distinct (%)100.0%
Missing338436
Missing (%)> 99.9%
Memory size2.6 MiB

Length

Max length15
Median length12
Mean length10.25
Min length8

Characters and Unicode

Total characters41
Distinct characters18
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4 ?
Unique (%)100.0%

Sample

1st rowChampsodontidae
2nd rowRubiaceae
3rd rowScincidae
4th rowFabaceae
Value            Count  Frequency (%)
champsodontidae  1      25.0%
rubiaceae        1      25.0%
scincidae        1      25.0%
fabaceae         1      25.0%

Most occurring characters

ValueCountFrequency (%)
a 8
19.5%
e 6
14.6%
c 4
9.8%
i 4
9.8%
d 3
 
7.3%
b 2
 
4.9%
o 2
 
4.9%
n 2
 
4.9%
C 1
 
2.4%
R 1
 
2.4%
Other values (8) 8
19.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 37
90.2%
Uppercase Letter 4
 
9.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8
21.6%
e 6
16.2%
c 4
10.8%
i 4
10.8%
d 3
 
8.1%
b 2
 
5.4%
o 2
 
5.4%
n 2
 
5.4%
u 1
 
2.7%
t 1
 
2.7%
Other values (4) 4
10.8%
Uppercase Letter
ValueCountFrequency (%)
C 1
25.0%
R 1
25.0%
S 1
25.0%
F 1
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 41
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8
19.5%
e 6
14.6%
c 4
9.8%
i 4
9.8%
d 3
 
7.3%
b 2
 
4.9%
o 2
 
4.9%
n 2
 
4.9%
C 1
 
2.4%
R 1
 
2.4%
Other values (8) 8
19.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 41
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 8
19.5%
e 6
14.6%
c 4
9.8%
i 4
9.8%
d 3
 
7.3%
b 2
 
4.9%
o 2
 
4.9%
n 2
 
4.9%
C 1
 
2.4%
R 1
 
2.4%
Other values (8) 8
19.5%
Distinct4
Distinct (%)100.0%
Missing338436
Missing (%)> 99.9%
Memory size2.6 MiB

Length

Max length13
Median length11
Mean length10
Min length5

Characters and Unicode

Total characters40
Distinct characters18
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4 ?
Unique (%)100.0%

Sample

1st rowChampsodon
2nd rowCoccocypselum
3rd rowEmoia
4th rowDimorphandra
Value          Count  Frequency (%)
champsodon     1      25.0%
coccocypselum  1      25.0%
emoia          1      25.0%
dimorphandra   1      25.0%

Most occurring characters

ValueCountFrequency (%)
o 6
15.0%
a 4
 
10.0%
m 4
 
10.0%
c 3
 
7.5%
p 3
 
7.5%
n 2
 
5.0%
i 2
 
5.0%
h 2
 
5.0%
C 2
 
5.0%
d 2
 
5.0%
Other values (8) 10
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 36
90.0%
Uppercase Letter 4
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 6
16.7%
a 4
11.1%
m 4
11.1%
c 3
8.3%
p 3
8.3%
n 2
 
5.6%
i 2
 
5.6%
h 2
 
5.6%
d 2
 
5.6%
s 2
 
5.6%
Other values (5) 6
16.7%
Uppercase Letter
ValueCountFrequency (%)
C 2
50.0%
E 1
25.0%
D 1
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 40
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 6
15.0%
a 4
 
10.0%
m 4
 
10.0%
c 3
 
7.5%
p 3
 
7.5%
n 2
 
5.0%
i 2
 
5.0%
h 2
 
5.0%
C 2
 
5.0%
d 2
 
5.0%
Other values (8) 10
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 40
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 6
15.0%
a 4
 
10.0%
m 4
 
10.0%
c 3
 
7.5%
p 3
 
7.5%
n 2
 
5.0%
i 2
 
5.0%
h 2
 
5.0%
C 2
 
5.0%
d 2
 
5.0%
Other values (8) 10
25.0%

formation
Text

Missing 

Distinct4
Distinct (%)100.0%
Missing338436
Missing (%)> 99.9%
Memory size2.6 MiB

Length

Max length13
Median length9.5
Mean length8.75
Min length3

Characters and Unicode

Total characters35
Distinct characters16
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4 ?
Unique (%)100.0%

Sample

1st rownudivittis
2nd rowguianense
3rd rowcaeruleocauda
4th rowsp.
Value          Count  Frequency (%)
nudivittis     1      25.0%
guianense      1      25.0%
caeruleocauda  1      25.0%
sp             1      25.0%
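
The sample row `sp.` appears in the frequency table as `sp`, which suggests that values are lowercased, split on whitespace, and stripped of surrounding punctuation before counting. The sketch below reproduces that behaviour under exactly those assumptions; `word_counts` is a hypothetical helper, not the profiling tool's own function.

```python
import string
from collections import Counter

def word_counts(values):
    """Count lowercase whitespace-separated tokens, trimming edge punctuation."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            # Strip punctuation only at the ends, so "sp." becomes "sp"
            # while an internal dot (as in "s.l") would be kept.
            counts[token.strip(string.punctuation)] += 1
    return counts

counts = word_counts(["nudivittis", "guianense", "caeruleocauda", "sp."])
for word, n in counts.most_common():
    print(word, n)
```

Under the same assumption, a token made up only of punctuation would reduce to an empty string and be counted under a blank value.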

Most occurring characters

ValueCountFrequency (%)
u 4
11.4%
i 4
11.4%
a 4
11.4%
e 4
11.4%
n 3
8.6%
s 3
8.6%
d 2
 
5.7%
t 2
 
5.7%
c 2
 
5.7%
v 1
 
2.9%
Other values (6) 6
17.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 34
97.1%
Other Punctuation 1
 
2.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 4
11.8%
i 4
11.8%
a 4
11.8%
e 4
11.8%
n 3
8.8%
s 3
8.8%
d 2
 
5.9%
t 2
 
5.9%
c 2
 
5.9%
v 1
 
2.9%
Other values (5) 5
14.7%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 34
97.1%
Common 1
 
2.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 4
11.8%
i 4
11.8%
a 4
11.8%
e 4
11.8%
n 3
8.8%
s 3
8.8%
d 2
 
5.9%
t 2
 
5.9%
c 2
 
5.9%
v 1
 
2.9%
Other values (5) 5
14.7%
Common
ValueCountFrequency (%)
. 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 35
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
u 4
11.4%
i 4
11.4%
a 4
11.4%
e 4
11.4%
n 3
8.6%
s 3
8.6%
d 2
 
5.7%
t 2
 
5.7%
c 2
 
5.7%
v 1
 
2.9%
Other values (6) 6
17.1%
Distinct17
Distinct (%)0.3%
Missing333367
Missing (%)98.5%
Memory size2.6 MiB

Length

Max length17
Median length9
Mean length5.2578356
Min length2

Characters and Unicode

Total characters26673
Distinct characters28
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowaff.
2nd rowcf.
3rd rowaff.
4th rowuncertain
5th rowuncertain
Value             Count  Frequency (%)
cf                2742   53.8%
uncertain         1860   36.5%
aff               320    6.3%
near              75     1.5%
complex           38     0.7%
sp                16     0.3%
group             12     0.2%
n                 10     0.2%
nov               6      0.1%
s.l               5      0.1%
Other values (5)  12     0.2%

Most occurring characters

ValueCountFrequency (%)
c 4635
17.4%
n 3811
14.3%
f 3382
12.7%
. 2735
10.3%
a 2239
8.4%
e 1978
7.4%
r 1947
7.3%
t 1860
7.0%
i 1860
7.0%
u 1846
 
6.9%
Other values (18) 380
 
1.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 23860
89.5%
Other Punctuation 2735
 
10.3%
Uppercase Letter 53
 
0.2%
Space Separator 23
 
0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 4635
19.4%
n 3811
16.0%
f 3382
14.2%
a 2239
9.4%
e 1978
8.3%
r 1947
8.2%
t 1860
7.8%
i 1860
7.8%
u 1846
 
7.7%
p 66
 
0.3%
Other values (9) 236
 
1.0%
Uppercase Letter
ValueCountFrequency (%)
U 28
52.8%
A 17
32.1%
C 6
 
11.3%
K 1
 
1.9%
S 1
 
1.9%
Other Punctuation
ValueCountFrequency (%)
. 2735
100.0%
Space Separator
ValueCountFrequency (%)
23
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 23913
89.7%
Common 2760
 
10.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 4635
19.4%
n 3811
15.9%
f 3382
14.1%
a 2239
9.4%
e 1978
8.3%
r 1947
8.1%
t 1860
7.8%
i 1860
7.8%
u 1846
 
7.7%
p 66
 
0.3%
Other values (14) 289
 
1.2%
Common
ValueCountFrequency (%)
. 2735
99.1%
23
 
0.8%
( 1
 
< 0.1%
) 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 26673
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 4635
17.4%
n 3811
14.3%
f 3382
12.7%
. 2735
10.3%
a 2239
8.4%
e 1978
7.4%
r 1947
7.3%
t 1860
7.0%
i 1860
7.0%
u 1846
 
6.9%
Other values (18) 380
 
1.4%

typeStatus
Text

Missing 

Distinct33
Distinct (%)0.5%
Missing331835
Missing (%)98.0%
Memory size2.6 MiB

Length

Max length27
Median length8
Mean length8.101438304
Min length2

Characters and Unicode

Total characters53510
Distinct characters39
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowParatype
2nd rowParatype
3rd rowParatype
4th rowParatype
5th rowParatype
Value              Count  Frequency (%)
paratype           5799   86.7%
holotype           332    5.0%
paralectotype      125    1.9%
cotype             86     1.3%
syntype            78     1.2%
type               73     1.1%
of                 34     0.5%
paratopotype       33     0.5%
allotype           23     0.3%
ms                 22     0.3%
Other values (22)  80     1.2%

Most occurring characters

ValueCountFrequency (%)
a 11984
22.4%
e 6769
12.6%
t 6709
12.5%
y 6671
12.5%
p 6663
12.5%
r 5984
11.2%
P 5961
11.1%
o 1043
 
1.9%
l 522
 
1.0%
H 338
 
0.6%
Other values (29) 866
 
1.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 46739
87.3%
Uppercase Letter 6685
 
12.5%
Space Separator 80
 
0.1%
Other Punctuation 6
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 11984
25.6%
e 6769
14.5%
t 6709
14.4%
y 6671
14.3%
p 6663
14.3%
r 5984
12.8%
o 1043
 
2.2%
l 522
 
1.1%
c 148
 
0.3%
n 93
 
0.2%
Other values (10) 153
 
0.3%
Uppercase Letter
ValueCountFrequency (%)
P 5961
89.2%
H 338
 
5.1%
T 92
 
1.4%
C 88
 
1.3%
S 78
 
1.2%
O 34
 
0.5%
M 25
 
0.4%
A 24
 
0.4%
N 13
 
0.2%
L 12
 
0.2%
Other values (6) 20
 
0.3%
Other Punctuation
ValueCountFrequency (%)
? 4
66.7%
; 2
33.3%
Space Separator
ValueCountFrequency (%)
80
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 53424
99.8%
Common 86
 
0.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 11984
22.4%
e 6769
12.7%
t 6709
12.6%
y 6671
12.5%
p 6663
12.5%
r 5984
11.2%
P 5961
11.2%
o 1043
 
2.0%
l 522
 
1.0%
H 338
 
0.6%
Other values (26) 780
 
1.5%
Common
ValueCountFrequency (%)
80
93.0%
? 4
 
4.7%
; 2
 
2.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 53510
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 11984
22.4%
e 6769
12.6%
t 6709
12.5%
y 6671
12.5%
p 6663
12.5%
r 5984
11.2%
P 5961
11.1%
o 1043
 
1.9%
l 522
 
1.0%
H 338
 
0.6%
Other values (29) 866
 
1.6%

identifiedBy
Text

Missing 

Distinct1866
Distinct (%)1.7%
Missing226287
Missing (%)66.9%
Memory size2.6 MiB

Length

Max length150
Median length128
Mean length39.12073685
Min length2

Characters and Unicode

Total characters4387508
Distinct characters83
Distinct categories10 ?
Distinct scripts2 ?
Distinct blocks3 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique200 ?
Unique (%)0.2%

Sample

1st rowAnker, Arthur
2nd rowOsborn, Karen J., (IZ), Smithsonian Institution - National Museum of Natural History (UNITED STATES)
3rd rowBaldwin, Carole C.
4th rowHobbs, Horton H., Jr., Smithsonian Institution, National Museum of Natural History
5th rowPaulay, Gustav, University of Florida (UNITED STATES)
ValueCountFrequency (%)
united 36063
 
5.8%
states 36020
 
5.8%
of 27856
 
4.5%
smithsonian 24359
 
3.9%
22487
 
3.6%
institution 20525
 
3.3%
national 18667
 
3.0%
museum 17529
 
2.8%
natural 17249
 
2.8%
history 17170
 
2.8%
Other values (2280) 384533
61.8%

Most occurring characters

ValueCountFrequency (%)
510305
 
11.6%
i 263200
 
6.0%
a 260172
 
5.9%
t 236846
 
5.4%
n 236742
 
5.4%
o 217779
 
5.0%
e 199321
 
4.5%
, 179315
 
4.1%
r 173463
 
4.0%
s 169712
 
3.9%
Other values (73) 1940653
44.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2460341
56.1%
Uppercase Letter 994498
22.7%
Space Separator 510305
 
11.6%
Other Punctuation 274335
 
6.3%
Close Punctuation 61794
 
1.4%
Open Punctuation 61794
 
1.4%
Dash Punctuation 23913
 
0.5%
Decimal Number 518
 
< 0.1%
Initial Punctuation 5
 
< 0.1%
Final Punctuation 5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 263200
10.7%
a 260172
10.6%
t 236846
9.6%
n 236742
9.6%
o 217779
8.9%
e 199321
8.1%
r 173463
 
7.1%
s 169712
 
6.9%
l 137141
 
5.6%
u 122744
 
5.0%
Other values (27) 443221
18.0%
Uppercase Letter
ValueCountFrequency (%)
T 131860
13.3%
S 126119
12.7%
E 89252
 
9.0%
N 81544
 
8.2%
I 69952
 
7.0%
A 68290
 
6.9%
D 60287
 
6.1%
U 50369
 
5.1%
M 44706
 
4.5%
B 34118
 
3.4%
Other values (18) 238001
23.9%
Other Punctuation
ValueCountFrequency (%)
, 179315
65.4%
. 89290
32.5%
; 4450
 
1.6%
' 576
 
0.2%
& 430
 
0.2%
/ 274
 
0.1%
Decimal Number
ValueCountFrequency (%)
2 148
28.6%
4 74
14.3%
6 74
14.3%
0 74
14.3%
1 74
14.3%
9 74
14.3%
Space Separator
ValueCountFrequency (%)
510305
100.0%
Close Punctuation
ValueCountFrequency (%)
) 61794
100.0%
Open Punctuation
ValueCountFrequency (%)
( 61794
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 23913
100.0%
Initial Punctuation
ValueCountFrequency (%)
5
100.0%
Final Punctuation
ValueCountFrequency (%)
5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3454839
78.7%
Common 932669
 
21.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 263200
 
7.6%
a 260172
 
7.5%
t 236846
 
6.9%
n 236742
 
6.9%
o 217779
 
6.3%
e 199321
 
5.8%
r 173463
 
5.0%
s 169712
 
4.9%
l 137141
 
4.0%
T 131860
 
3.8%
Other values (55) 1428603
41.4%
Common
ValueCountFrequency (%)
510305
54.7%
, 179315
 
19.2%
. 89290
 
9.6%
) 61794
 
6.6%
( 61794
 
6.6%
- 23913
 
2.6%
; 4450
 
0.5%
' 576
 
0.1%
& 430
 
< 0.1%
/ 274
 
< 0.1%
Other values (8) 528
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4386974
> 99.9%
None 524
 
< 0.1%
Punctuation 10
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
510305
 
11.6%
i 263200
 
6.0%
a 260172
 
5.9%
t 236846
 
5.4%
n 236742
 
5.4%
o 217779
 
5.0%
e 199321
 
4.5%
, 179315
 
4.1%
r 173463
 
4.0%
s 169712
 
3.9%
Other values (58) 1940119
44.2%
None
ValueCountFrequency (%)
í 212
40.5%
ö 129
24.6%
á 99
18.9%
ø 29
 
5.5%
ú 26
 
5.0%
ó 12
 
2.3%
Ø 7
 
1.3%
ë 3
 
0.6%
è 3
 
0.6%
ñ 1
 
0.2%
Other values (3) 3
 
0.6%
Punctuation
ValueCountFrequency (%)
5
50.0%
5
50.0%

scientificName
Text

Missing 

Distinct46019
Distinct (%)14.6%
Missing24062
Missing (%)7.1%
Memory size2.6 MiB

Length

Max length85
Median length63
Mean length18.57501161
Min length3

Characters and Unicode

Total characters5839575
Distinct characters79
Distinct categories9 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique10046 ?
Unique (%)3.2%

Sample

1st rowRectiostoma fernaldella
2nd rowPolystichum sp.
3rd rowMesontoplatys bolzi
4th rowBursa granularis
5th rowAmanses scopas
Value                 Count   Frequency (%)
sp                    50672   7.9%
plethodon             4677    0.7%
orconectes            4553    0.7%
indet                 4208    0.7%
procambarus           3787    0.6%
unidentified          3704    0.6%
bathymodiolus         2599    0.4%
cinereus              2327    0.4%
riftia                2008    0.3%
truncatus             1928    0.3%
Other values (42986)  557516  87.4%

Most occurring characters

ValueCountFrequency (%)
a 627874
 
10.8%
i 494432
 
8.5%
s 465939
 
8.0%
e 412196
 
7.1%
o 370175
 
6.3%
r 352777
 
6.0%
323601
 
5.5%
l 300324
 
5.1%
n 295293
 
5.1%
t 291408
 
5.0%
Other values (69) 1905556
32.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5135444
87.9%
Space Separator 323601
 
5.5%
Uppercase Letter 317060
 
5.4%
Other Punctuation 57396
 
1.0%
Open Punctuation 2361
 
< 0.1%
Close Punctuation 2361
 
< 0.1%
Decimal Number 881
 
< 0.1%
Connector Punctuation 279
 
< 0.1%
Dash Punctuation 192
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 627874
12.2%
i 494432
 
9.6%
s 465939
 
9.1%
e 412196
 
8.0%
o 370175
 
7.2%
r 352777
 
6.9%
l 300324
 
5.8%
n 295293
 
5.8%
t 291408
 
5.7%
u 273496
 
5.3%
Other values (19) 1251530
24.4%
Uppercase Letter
ValueCountFrequency (%)
P 48132
15.2%
C 36686
11.6%
A 33497
10.6%
S 23837
 
7.5%
M 18826
 
5.9%
E 17864
 
5.6%
L 16391
 
5.2%
H 16038
 
5.1%
T 14052
 
4.4%
D 13559
 
4.3%
Other values (17) 78178
24.7%
Decimal Number
ValueCountFrequency (%)
0 270
30.6%
1 264
30.0%
2 99
 
11.2%
3 66
 
7.5%
6 56
 
6.4%
7 46
 
5.2%
8 26
 
3.0%
4 23
 
2.6%
5 17
 
1.9%
9 14
 
1.6%
Other Punctuation
ValueCountFrequency (%)
. 56743
98.9%
" 306
 
0.5%
' 252
 
0.4%
, 65
 
0.1%
/ 13
 
< 0.1%
& 11
 
< 0.1%
? 5
 
< 0.1%
# 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
323601
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2361
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2361
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 279
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 192
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5452504
93.4%
Common 387071
 
6.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 627874
11.5%
i 494432
 
9.1%
s 465939
 
8.5%
e 412196
 
7.6%
o 370175
 
6.8%
r 352777
 
6.5%
l 300324
 
5.5%
n 295293
 
5.4%
t 291408
 
5.3%
u 273496
 
5.0%
Other values (46) 1568590
28.8%
Common
ValueCountFrequency (%)
323601
83.6%
. 56743
 
14.7%
( 2361
 
0.6%
) 2361
 
0.6%
" 306
 
0.1%
_ 279
 
0.1%
0 270
 
0.1%
1 264
 
0.1%
' 252
 
0.1%
- 192
 
< 0.1%
Other values (13) 442
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5839562
> 99.9%
None 13
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 627874
 
10.8%
i 494432
 
8.5%
s 465939
 
8.0%
e 412196
 
7.1%
o 370175
 
6.3%
r 352777
 
6.0%
323601
 
5.5%
l 300324
 
5.1%
n 295293
 
5.1%
t 291408
 
5.0%
Other values (65) 1905543
32.6%
None
ValueCountFrequency (%)
ë 9
69.2%
ö 2
 
15.4%
Á 1
 
7.7%
é 1
 
7.7%

higherClassification
Text

Missing 

Distinct4815
Distinct (%)1.4%
Missing5901
Missing (%)1.7%
Memory size2.6 MiB

Length

Max length162
Median length142
Mean length76.5582473
Min length6

Characters and Unicode

Total characters25458603
Distinct characters68
Distinct categories9 ?
Distinct scripts2 ?
Distinct blocks3 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique458 ?
Unique (%)0.1%

Sample

1st rowAnimalia, Arthropoda, Insecta, Lepidoptera, Depressariidae, Stenomatinae
2nd rowAnimalia, Annelida, Polychaeta, Sedentaria, Canalipalpata, Sabellida, Siboglinidae
3rd rowAnimalia, Annelida, Polychaeta, Errantia, Amphinomida, Amphinomidae
4th rowAnimalia, Arthropoda, Crustacea, Malacostraca, Eumalacostraca, Eucarida, Decapoda, Pleocyemata, Cambaridae
5th rowPlantae, Pteridophyte, Polypodiales, Dryopteridaceae
Value                Count    Frequency (%)
animalia             287708   13.0%
arthropoda           145883   6.6%
insecta              113237   5.1%
chordata             103543   4.7%
vertebrata           102503   4.6%
lepidoptera          79773    3.6%
actinopterygii       40747    1.8%
osteichthyes         40745    1.8%
neopterygii          40742    1.8%
plantae              35547    1.6%
Other values (5328)  1221137  55.2%

Most occurring characters

ValueCountFrequency (%)
a 3312348
13.0%
i 2171272
 
8.5%
e 2153801
 
8.5%
1879026
 
7.4%
, 1876750
 
7.4%
t 1539991
 
6.0%
r 1526780
 
6.0%
o 1482836
 
5.8%
n 1001707
 
3.9%
d 934382
 
3.7%
Other values (58) 7579710
29.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 19488128
76.5%
Uppercase Letter 2209251
 
8.7%
Other Punctuation 1880599
 
7.4%
Space Separator 1879026
 
7.4%
Close Punctuation 715
 
< 0.1%
Open Punctuation 715
 
< 0.1%
Dash Punctuation 127
 
< 0.1%
Decimal Number 30
 
< 0.1%
Connector Punctuation 12
 
< 0.1%
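
The category labels in the table above are Unicode general categories. As a rough sketch of how such counts can be derived (not necessarily how the profiling tool computes them), Python's standard `unicodedata.category` returns a two-letter code for each character. The `CATEGORY_NAMES` mapping and the `category_counts` helper below are illustrative names and are not part of the report.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report section.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Nd": "Decimal Number",
    "Pc": "Connector Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw two-letter code for unmapped categories.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Example input: the 5th sample row of higherClassification.
sample = ["Plantae, Pteridophyte, Polypodiales, Dryopteridaceae"]
for name, count in category_counts(sample).most_common():
    print(name, count)
```

Scripts and blocks are not exposed by `unicodedata`, so reproducing those tables would need an additional Unicode data source.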

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3312348
17.0%
i 2171272
11.1%
e 2153801
11.1%
t 1539991
7.9%
r 1526780
7.8%
o 1482836
7.6%
n 1001707
 
5.1%
d 934382
 
4.8%
l 865315
 
4.4%
c 813387
 
4.2%
Other values (17) 3686309
18.9%
Uppercase Letter
ValueCountFrequency (%)
A 616961
27.9%
C 270329
12.2%
P 208454
 
9.4%
M 125019
 
5.7%
I 120936
 
5.5%
E 116858
 
5.3%
L 112485
 
5.1%
V 112145
 
5.1%
S 86438
 
3.9%
D 72328
 
3.3%
Other values (16) 367298
16.6%
Decimal Number
ValueCountFrequency (%)
6 9
30.0%
0 6
20.0%
1 6
20.0%
3 6
20.0%
9 3
 
10.0%
Other Punctuation
ValueCountFrequency (%)
, 1876750
99.8%
. 3849
 
0.2%
Close Punctuation
ValueCountFrequency (%)
) 679
95.0%
] 36
 
5.0%
Open Punctuation
ValueCountFrequency (%)
( 679
95.0%
[ 36
 
5.0%
Dash Punctuation
ValueCountFrequency (%)
124
97.6%
- 3
 
2.4%
Space Separator
ValueCountFrequency (%)
1879026
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 21697379
85.2%
Common 3761224
 
14.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3312348
15.3%
i 2171272
 
10.0%
e 2153801
 
9.9%
t 1539991
 
7.1%
r 1526780
 
7.0%
o 1482836
 
6.8%
n 1001707
 
4.6%
d 934382
 
4.3%
l 865315
 
4.0%
c 813387
 
3.7%
Other values (43) 5895560
27.2%
Common
ValueCountFrequency (%)
1879026
50.0%
, 1876750
49.9%
. 3849
 
0.1%
) 679
 
< 0.1%
( 679
 
< 0.1%
124
 
< 0.1%
[ 36
 
< 0.1%
] 36
 
< 0.1%
_ 12
 
< 0.1%
6 9
 
< 0.1%
Other values (5) 24
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 25458461
> 99.9%
Punctuation 124
 
< 0.1%
None 18
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3312348
13.0%
i 2171272
 
8.5%
e 2153801
 
8.5%
1879026
 
7.4%
, 1876750
 
7.4%
t 1539991
 
6.0%
r 1526780
 
6.0%
o 1482836
 
5.8%
n 1001707
 
3.9%
d 934382
 
3.7%
Other values (56) 7579568
29.8%
Punctuation
ValueCountFrequency (%)
124
100.0%
None
ValueCountFrequency (%)
ö 18
100.0%

kingdom
Text

Missing 

Distinct: 10
Distinct (%): < 0.1%
Missing: 10613
Missing (%): 3.1%
Memory size: 2.6 MiB

Length

Max length: 10
Median length: 8
Mean length: 7.904840053
Min length: 5

Characters and Unicode

Total characters: 2591420
Distinct characters: 25
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Animalia
2nd row: Animalia
3rd row: Animalia
4th row: Animalia
5th row: Plantae
Value        Count    Frequency (%)
animalia     287708   87.8%
plantae      35547    10.8%
chromista    2994     0.9%
eubacteria   1163     0.4%
fungi        322      0.1%
protista     42       < 0.1%
metazoa      24       < 0.1%
eukaryota    21       < 0.1%
bacteria     3        < 0.1%
protozoa     3        < 0.1%

Most occurring characters

ValueCountFrequency (%)
a 651971
25.2%
i 579940
22.4%
n 323577
12.5%
l 323255
12.5%
m 290702
11.2%
A 287708
11.1%
t 39839
 
1.5%
e 36737
 
1.4%
P 35592
 
1.4%
r 4226
 
0.2%
Other values (15) 17873
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2263593
87.3%
Uppercase Letter 327827
 
12.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 651971
28.8%
i 579940
25.6%
n 323577
14.3%
l 323255
14.3%
m 290702
12.8%
t 39839
 
1.8%
e 36737
 
1.6%
r 4226
 
0.2%
o 3090
 
0.1%
s 3036
 
0.1%
Other values (8) 7220
 
0.3%
Uppercase Letter
ValueCountFrequency (%)
A 287708
87.8%
P 35592
 
10.9%
C 2994
 
0.9%
E 1184
 
0.4%
F 322
 
0.1%
M 24
 
< 0.1%
B 3
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 2591420
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 651971
25.2%
i 579940
22.4%
n 323577
12.5%
l 323255
12.5%
m 290702
11.2%
A 287708
11.1%
t 39839
 
1.5%
e 36737
 
1.4%
P 35592
 
1.4%
r 4226
 
0.2%
Other values (15) 17873
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2591420
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 651971
25.2%
i 579940
22.4%
n 323577
12.5%
l 323255
12.5%
m 290702
11.2%
A 287708
11.1%
t 39839
 
1.5%
e 36737
 
1.4%
P 35592
 
1.4%
r 4226
 
0.2%
Other values (15) 17873
 
0.7%

phylum
Text

Missing 

Distinct: 62
Distinct (%): < 0.1%
Missing: 36740
Missing (%): 10.9%
Memory size: 2.6 MiB

Length

Max length: 33
Median length: 25
Mean length: 9.088574743
Min length: 6

Characters and Unicode

Total characters: 2742023
Distinct characters: 46
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 6
Unique (%): < 0.1%

Sample

1st row: Arthropoda
2nd row: Annelida
3rd row: Annelida
4th row: Arthropoda
5th row: Arthropoda
Value               Count    Frequency (%)
arthropoda          145883   48.3%
chordata            103543   34.3%
mollusca            20757    6.9%
annelida            11339    3.8%
cnidaria            3181     1.1%
rhodophyta          2943     1.0%
miozoa              2074     0.7%
echinodermata       1631     0.5%
chlorophyta         1622     0.5%
porifera            1250     0.4%
Other values (60)   8028     2.7%

Most occurring characters

ValueCountFrequency (%)
o 438855
16.0%
a 414148
15.1%
r 409436
14.9%
d 269990
9.8%
h 264892
9.7%
t 263154
9.6%
A 157278
 
5.7%
p 152337
 
5.6%
C 109732
 
4.0%
l 56835
 
2.1%
Other values (36) 205366
7.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2439452
89.0%
Uppercase Letter 301700
 
11.0%
Space Separator 551
 
< 0.1%
Other Punctuation 196
 
< 0.1%
Dash Punctuation 124
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 438855
18.0%
a 414148
17.0%
r 409436
16.8%
d 269990
11.1%
h 264892
10.9%
t 263154
10.8%
p 152337
 
6.2%
l 56835
 
2.3%
n 31280
 
1.3%
i 27003
 
1.1%
Other values (14) 111522
 
4.6%
Uppercase Letter
ValueCountFrequency (%)
A 157278
52.1%
C 109732
36.4%
M 23023
 
7.6%
R 2965
 
1.0%
P 2470
 
0.8%
E 1681
 
0.6%
N 1362
 
0.5%
B 1240
 
0.4%
O 830
 
0.3%
S 250
 
0.1%
Other values (9) 869
 
0.3%
Space Separator
ValueCountFrequency (%)
551
100.0%
Other Punctuation
ValueCountFrequency (%)
. 196
100.0%
Dash Punctuation
ValueCountFrequency (%)
124
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2741152
> 99.9%
Common 871
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 438855
16.0%
a 414148
15.1%
r 409436
14.9%
d 269990
9.8%
h 264892
9.7%
t 263154
9.6%
A 157278
 
5.7%
p 152337
 
5.6%
C 109732
 
4.0%
l 56835
 
2.1%
Other values (33) 204495
7.5%
Common
ValueCountFrequency (%)
551
63.3%
. 196
 
22.5%
124
 
14.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2741899
> 99.9%
Punctuation 124
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 438855
16.0%
a 414148
15.1%
r 409436
14.9%
d 269990
9.8%
h 264892
9.7%
t 263154
9.6%
A 157278
 
5.7%
p 152337
 
5.6%
C 109732
 
4.0%
l 56835
 
2.1%
Other values (35) 205242
7.5%
Punctuation
ValueCountFrequency (%)
124
100.0%

class
Text

Missing 

Distinct: 112
Distinct (%): < 0.1%
Missing: 12521
Missing (%): 3.7%
Memory size: 2.6 MiB

Length

Max length: 21
Median length: 19
Mean length: 9.50620553
Min length: 4

Characters and Unicode

Total characters: 3098253
Distinct characters: 48
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 9
Unique (%): < 0.1%

Sample

1st row: Insecta
2nd row: Polychaeta
3rd row: Polychaeta
4th row: Malacostraca
5th row: Pteridophyte
Value                Count    Frequency (%)
insecta              113237   34.7%
actinopterygii       40747    12.5%
malacostraca         27947    8.6%
mammalia             24499    7.5%
amphibia             18405    5.6%
dicotyledonae        15871    4.9%
monocotyledonae      10880    3.3%
polychaeta           10696    3.3%
reptilia             9873     3.0%
bivalvia             9780     3.0%
Other values (102)   44674    13.7%

Most occurring characters

ValueCountFrequency (%)
a 446449
14.4%
t 292487
 
9.4%
e 267717
 
8.6%
c 260251
 
8.4%
i 258346
 
8.3%
o 207477
 
6.7%
n 200830
 
6.5%
s 163672
 
5.3%
l 122646
 
4.0%
I 113253
 
3.7%
Other values (38) 765125
24.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2770286
89.4%
Uppercase Letter 325917
 
10.5%
Space Separator 690
 
< 0.1%
Close Punctuation 678
 
< 0.1%
Open Punctuation 678
 
< 0.1%
Other Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 446449
16.1%
t 292487
10.6%
e 267717
9.7%
c 260251
9.4%
i 258346
9.3%
o 207477
7.5%
n 200830
7.2%
s 163672
 
5.9%
l 122646
 
4.4%
p 95959
 
3.5%
Other values (14) 454452
16.4%
Uppercase Letter
ValueCountFrequency (%)
I 113253
34.7%
A 73869
22.7%
M 65072
20.0%
D 17381
 
5.3%
P 15685
 
4.8%
R 10130
 
3.1%
B 9824
 
3.0%
G 9575
 
2.9%
F 2547
 
0.8%
C 2402
 
0.7%
Other values (10) 6179
 
1.9%
Space Separator
ValueCountFrequency (%)
690
100.0%
Close Punctuation
ValueCountFrequency (%)
) 678
100.0%
Open Punctuation
ValueCountFrequency (%)
( 678
100.0%
Other Punctuation
ValueCountFrequency (%)
. 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3096203
99.9%
Common 2050
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 446449
14.4%
t 292487
 
9.4%
e 267717
 
8.6%
c 260251
 
8.4%
i 258346
 
8.3%
o 207477
 
6.7%
n 200830
 
6.5%
s 163672
 
5.3%
l 122646
 
4.0%
I 113253
 
3.7%
Other values (34) 763075
24.6%
Common
ValueCountFrequency (%)
690
33.7%
) 678
33.1%
( 678
33.1%
. 4
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3098253
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 446449
14.4%
t 292487
 
9.4%
e 267717
 
8.6%
c 260251
 
8.4%
i 258346
 
8.3%
o 207477
 
6.7%
n 200830
 
6.5%
s 163672
 
5.3%
l 122646
 
4.0%
I 113253
 
3.7%
Other values (38) 765125
24.7%

order
Text

Missing 

Distinct: 532
Distinct (%): 0.2%
Missing: 30431
Missing (%): 9.0%
Memory size: 2.6 MiB

Length

Max length: 28
Median length: 22
Mean length: 9.884892974
Min length: 5

Characters and Unicode

Total characters: 3044636
Distinct characters: 54
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 51
Unique (%): < 0.1%

Sample

1st row: Lepidoptera
2nd row: Sabellida
3rd row: Amphinomida
4th row: Decapoda
5th row: Polypodiales
Value                Count    Frequency (%)
lepidoptera          79773    25.9%
perciformes          26030    8.4%
decapoda             23842    7.7%
coleoptera           10156    3.3%
anura                10022    3.3%
squamata             9570     3.1%
hymenoptera          8500     2.8%
rodentia             8413     2.7%
caudata              8212     2.7%
poales               7860     2.6%
Other values (523)   115676   37.6%

Most occurring characters

ValueCountFrequency (%)
e 429551
14.1%
a 377525
12.4%
o 262354
 
8.6%
r 255846
 
8.4%
p 249068
 
8.2%
i 218439
 
7.2%
t 178257
 
5.9%
d 158092
 
5.2%
s 110896
 
3.6%
l 90314
 
3.0%
Other values (44) 714294
23.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2735137
89.8%
Uppercase Letter 307980
 
10.1%
Other Punctuation 1416
 
< 0.1%
Space Separator 45
 
< 0.1%
Open Punctuation 29
 
< 0.1%
Close Punctuation 29
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 429551
15.7%
a 377525
13.8%
o 262354
9.6%
r 255846
9.4%
p 249068
9.1%
i 218439
8.0%
t 178257
6.5%
d 158092
 
5.8%
s 110896
 
4.1%
l 90314
 
3.3%
Other values (16) 404795
14.8%
Uppercase Letter
ValueCountFrequency (%)
L 83861
27.2%
P 47799
15.5%
C 40251
13.1%
D 32936
 
10.7%
A 25439
 
8.3%
S 22700
 
7.4%
H 15207
 
4.9%
R 9970
 
3.2%
T 5215
 
1.7%
M 3808
 
1.2%
Other values (14) 20794
 
6.8%
Other Punctuation
ValueCountFrequency (%)
. 1416
100.0%
Space Separator
ValueCountFrequency (%)
45
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 29
100.0%
Close Punctuation
ValueCountFrequency (%)
] 29
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3043117
> 99.9%
Common 1519
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 429551
14.1%
a 377525
12.4%
o 262354
 
8.6%
r 255846
 
8.4%
p 249068
 
8.2%
i 218439
 
7.2%
t 178257
 
5.9%
d 158092
 
5.2%
s 110896
 
3.6%
l 90314
 
3.0%
Other values (40) 712775
23.4%
Common
ValueCountFrequency (%)
. 1416
93.2%
45
 
3.0%
[ 29
 
1.9%
] 29
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3044636
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 429551
14.1%
a 377525
12.4%
o 262354
 
8.6%
r 255846
 
8.4%
p 249068
 
8.2%
i 218439
 
7.2%
t 178257
 
5.9%
d 158092
 
5.2%
s 110896
 
3.6%
l 90314
 
3.0%
Other values (44) 714294
23.5%

family
Text

Missing 

Distinct: 2911
Distinct (%): 0.9%
Missing: 18609
Missing (%): 5.5%
Memory size: 2.6 MiB

Length

Max length: 38
Median length: 19
Mean length: 10.80786103
Min length: 6

Characters and Unicode

Total characters: 3456689
Distinct characters: 62
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 283
Unique (%): 0.1%

Sample

1st row: Depressariidae
2nd row: Siboglinidae
3rd row: Amphinomidae
4th row: Cambaridae
5th row: Dryopteridaceae
Value                 Count    Frequency (%)
cambaridae            12182    3.8%
geometridae           12034    3.8%
noctuidae             7956     2.5%
tortricidae           7260     2.3%
plethodontidae        6792     2.1%
poaceae               6686     2.1%
delphinidae           5544     1.7%
pyralidae             5230     1.6%
siboglinidae          5015     1.6%
vesicomyidae          4924     1.5%
Other values (2904)   246243   77.0%

Most occurring characters

ValueCountFrequency (%)
e 524407
15.2%
a 508472
14.7%
i 444074
12.8%
d 313710
 
9.1%
r 190392
 
5.5%
o 187351
 
5.4%
c 141788
 
4.1%
t 127760
 
3.7%
l 120634
 
3.5%
n 111227
 
3.2%
Other values (52) 786874
22.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3134548
90.7%
Uppercase Letter 319831
 
9.3%
Other Punctuation 2231
 
0.1%
Space Separator 35
 
< 0.1%
Decimal Number 30
 
< 0.1%
Connector Punctuation 12
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 524407
16.7%
a 508472
16.2%
i 444074
14.2%
d 313710
10.0%
r 190392
 
6.1%
o 187351
 
6.0%
c 141788
 
4.5%
t 127760
 
4.1%
l 120634
 
3.8%
n 111227
 
3.5%
Other values (16) 464733
14.8%
Uppercase Letter
ValueCountFrequency (%)
C 51983
16.3%
P 45338
14.2%
G 29376
9.2%
S 26214
 
8.2%
A 23086
 
7.2%
T 18933
 
5.9%
M 16536
 
5.2%
D 15688
 
4.9%
N 13489
 
4.2%
L 13475
 
4.2%
Other values (16) 65713
20.5%
Decimal Number
ValueCountFrequency (%)
6 9
30.0%
0 6
20.0%
1 6
20.0%
3 6
20.0%
9 3
 
10.0%
Other Punctuation
ValueCountFrequency (%)
. 2231
100.0%
Space Separator
ValueCountFrequency (%)
35
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 12
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3454379
99.9%
Common 2310
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 524407
15.2%
a 508472
14.7%
i 444074
12.9%
d 313710
 
9.1%
r 190392
 
5.5%
o 187351
 
5.4%
c 141788
 
4.1%
t 127760
 
3.7%
l 120634
 
3.5%
n 111227
 
3.2%
Other values (42) 784564
22.7%
Common
ValueCountFrequency (%)
. 2231
96.6%
35
 
1.5%
_ 12
 
0.5%
6 9
 
0.4%
0 6
 
0.3%
1 6
 
0.3%
3 6
 
0.3%
9 3
 
0.1%
( 1
 
< 0.1%
) 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3456689
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 524407
15.2%
a 508472
14.7%
i 444074
12.8%
d 313710
 
9.1%
r 190392
 
5.5%
o 187351
 
5.4%
c 141788
 
4.1%
t 127760
 
3.7%
l 120634
 
3.5%
n 111227
 
3.2%
Other values (52) 786874
22.8%

genus
Text

Missing 

Distinct: 19356
Distinct (%): 6.2%
Missing: 25827
Missing (%): 7.6%
Memory size: 2.6 MiB

Length

Max length: 21
Median length: 18
Mean length: 9.340145803
Min length: 2

Characters and Unicode

Total characters: 2919851
Distinct characters: 64
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 2069
Unique (%): 0.7%

Sample

1st row: Rectiostoma
2nd row: Polystichum
3rd row: Mesontoplatys
4th row: Bursa
5th row: Amanses
Value                  Count    Frequency (%)
plethodon              4675     1.5%
orconectes             4553     1.5%
indet                  4240     1.4%
procambarus            3784     1.2%
unidentified           3704     1.2%
bathymodiolus          2599     0.8%
riftia                 2008     0.6%
tursiops               1921     0.6%
cambarus               1854     0.6%
delphinus              1663     0.5%
Other values (19347)   281620   90.1%

Most occurring characters

ValueCountFrequency (%)
a 311120
 
10.7%
o 246546
 
8.4%
i 226277
 
7.7%
e 220426
 
7.5%
s 205102
 
7.0%
r 186671
 
6.4%
t 155628
 
5.3%
n 142188
 
4.9%
l 139011
 
4.8%
u 122527
 
4.2%
Other values (54) 964355
33.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2602833
89.1%
Uppercase Letter 312570
 
10.7%
Other Punctuation 4244
 
0.1%
Decimal Number 126
 
< 0.1%
Connector Punctuation 60
 
< 0.1%
Dash Punctuation 10
 
< 0.1%
Space Separator 8
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 311120
12.0%
o 246546
 
9.5%
i 226277
 
8.7%
e 220426
 
8.5%
s 205102
 
7.9%
r 186671
 
7.2%
t 155628
 
6.0%
n 142188
 
5.5%
l 139011
 
5.3%
u 122527
 
4.7%
Other values (17) 647337
24.9%
Uppercase Letter
ValueCountFrequency (%)
P 47429
15.2%
C 36228
11.6%
A 32996
10.6%
S 23260
 
7.4%
M 18527
 
5.9%
E 17691
 
5.7%
L 16233
 
5.2%
H 15833
 
5.1%
T 13931
 
4.5%
D 13470
 
4.3%
Other values (16) 76972
24.6%
Decimal Number
ValueCountFrequency (%)
0 54
42.9%
1 27
21.4%
2 12
 
9.5%
3 12
 
9.5%
4 9
 
7.1%
6 9
 
7.1%
9 3
 
2.4%
Other Punctuation
ValueCountFrequency (%)
. 4244
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 60
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10
100.0%
Space Separator
ValueCountFrequency (%)
8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2915403
99.8%
Common 4448
 
0.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 311120
 
10.7%
o 246546
 
8.5%
i 226277
 
7.8%
e 220426
 
7.6%
s 205102
 
7.0%
r 186671
 
6.4%
t 155628
 
5.3%
n 142188
 
4.9%
l 139011
 
4.8%
u 122527
 
4.2%
Other values (43) 959907
32.9%
Common
ValueCountFrequency (%)
. 4244
95.4%
_ 60
 
1.3%
0 54
 
1.2%
1 27
 
0.6%
2 12
 
0.3%
3 12
 
0.3%
- 10
 
0.2%
4 9
 
0.2%
6 9
 
0.2%
8
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2919842
> 99.9%
None 9
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 311120
 
10.7%
o 246546
 
8.4%
i 226277
 
7.7%
e 220426
 
7.5%
s 205102
 
7.0%
r 186671
 
6.4%
t 155628
 
5.3%
n 142188
 
4.9%
l 139011
 
4.8%
u 122527
 
4.2%
Other values (53) 964346
33.0%
None
ValueCountFrequency (%)
ë 9
100.0%

subgenus
Text

Missing 

Distinct: 293
Distinct (%): 12.7%
Missing: 336132
Missing (%): 99.3%
Memory size: 2.6 MiB

Length

Max length: 21
Median length: 16
Mean length: 10.68674177
Min length: 3

Characters and Unicode

Total characters: 24665
Distinct characters: 48
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 46
Unique (%): 2.0%

Sample

1st row: Scapulicambarus
2nd row: Amara
3rd row: Anopheles
4th row: Dipremna
5th row: Abax
Value                Count   Frequency (%)
ortmannicus          142     6.2%
pyrocera             120     5.2%
aviticambarus        78      3.4%
jugicambarus         68      2.9%
creaserinus          64      2.8%
pennides             62      2.7%
girardiella          56      2.4%
scapulicambarus      47      2.0%
ochlerotatus         42      1.8%
apiocera             38      1.6%
Other values (283)   1591    68.9%

Most occurring characters

ValueCountFrequency (%)
a 3119
12.6%
r 2160
 
8.8%
i 1917
 
7.8%
e 1855
 
7.5%
s 1832
 
7.4%
o 1443
 
5.9%
c 1332
 
5.4%
u 1293
 
5.2%
n 1221
 
5.0%
l 1176
 
4.8%
Other values (38) 7317
29.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 22357
90.6%
Uppercase Letter 2308
 
9.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3119
14.0%
r 2160
9.7%
i 1917
 
8.6%
e 1855
 
8.3%
s 1832
 
8.2%
o 1443
 
6.5%
c 1332
 
6.0%
u 1293
 
5.8%
n 1221
 
5.5%
l 1176
 
5.3%
Other values (15) 5009
22.4%
Uppercase Letter
ValueCountFrequency (%)
P 481
20.8%
A 287
12.4%
C 245
10.6%
O 200
8.7%
M 184
 
8.0%
S 128
 
5.5%
H 120
 
5.2%
E 97
 
4.2%
G 91
 
3.9%
D 76
 
3.3%
Other values (13) 399
17.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 24665
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3119
12.6%
r 2160
 
8.8%
i 1917
 
7.8%
e 1855
 
7.5%
s 1832
 
7.4%
o 1443
 
5.9%
c 1332
 
5.4%
u 1293
 
5.2%
n 1221
 
5.0%
l 1176
 
4.8%
Other values (38) 7317
29.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24665
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3119
12.6%
r 2160
 
8.8%
i 1917
 
7.8%
e 1855
 
7.5%
s 1832
 
7.4%
o 1443
 
5.9%
c 1332
 
5.4%
u 1293
 
5.2%
n 1221
 
5.0%
l 1176
 
4.8%
Other values (38) 7317
29.7%

specificEpithet
Text

Missing 

Distinct: 23245
Distinct (%): 7.6%
Missing: 33273
Missing (%): 9.8%
Memory size: 2.6 MiB

Length

Max length: 23
Median length: 19
Mean length: 7.933020281
Min length: 2

Characters and Unicode

Total characters: 2420896
Distinct characters: 48
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 3387
Unique (%): 1.1%

Sample

1st row: fernaldella
2nd row: sp.
3rd row: bolzi
4th row: granularis
5th row: scopas
Value                  Count    Frequency (%)
sp                     49913    16.3%
truncatus              1928     0.6%
cinereus               1834     0.6%
delphis                1661     0.5%
porphyriticus          816      0.3%
acutus                 779      0.3%
opacum                 765      0.3%
hoffmani               640      0.2%
maculatus              635      0.2%
nigripes               624      0.2%
Other values (23227)   245891   80.5%

Most occurring characters

ValueCountFrequency (%)
a 298090
12.3%
i 250360
10.3%
s 244354
 
10.1%
e 177113
 
7.3%
r 154433
 
6.4%
l 153485
 
6.3%
u 141392
 
5.8%
n 141311
 
5.8%
t 129210
 
5.3%
p 116548
 
4.8%
Other values (38) 614600
25.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2369241
97.9%
Other Punctuation 50232
 
2.1%
Decimal Number 705
 
< 0.1%
Space Separator 319
 
< 0.1%
Connector Punctuation 219
 
< 0.1%
Dash Punctuation 176
 
< 0.1%
Open Punctuation 2
 
< 0.1%
Close Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 298090
12.6%
i 250360
10.6%
s 244354
10.3%
e 177113
 
7.5%
r 154433
 
6.5%
l 153485
 
6.5%
u 141392
 
6.0%
n 141311
 
6.0%
t 129210
 
5.5%
p 116548
 
4.9%
Other values (16) 562945
23.8%
Decimal Number
ValueCountFrequency (%)
1 230
32.6%
0 202
28.7%
2 74
 
10.5%
3 50
 
7.1%
6 47
 
6.7%
7 42
 
6.0%
8 24
 
3.4%
5 16
 
2.3%
9 10
 
1.4%
4 10
 
1.4%
Other Punctuation
ValueCountFrequency (%)
. 49950
99.4%
" 260
 
0.5%
/ 10
 
< 0.1%
? 5
 
< 0.1%
, 4
 
< 0.1%
' 2
 
< 0.1%
# 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
319
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 219
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 176
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2369241
97.9%
Common 51655
 
2.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 298090
12.6%
i 250360
10.6%
s 244354
10.3%
e 177113
 
7.5%
r 154433
 
6.5%
l 153485
 
6.5%
u 141392
 
6.0%
n 141311
 
6.0%
t 129210
 
5.5%
p 116548
 
4.9%
Other values (16) 562945
23.8%
Common
ValueCountFrequency (%)
. 49950
96.7%
319
 
0.6%
" 260
 
0.5%
1 230
 
0.4%
_ 219
 
0.4%
0 202
 
0.4%
- 176
 
0.3%
2 74
 
0.1%
3 50
 
0.1%
6 47
 
0.1%
Other values (12) 128
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2420896
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 298090
12.3%
i 250360
10.3%
s 244354
 
10.1%
e 177113
 
7.3%
r 154433
 
6.4%
l 153485
 
6.3%
u 141392
 
5.8%
n 141311
 
5.8%
t 129210
 
5.3%
p 116548
 
4.8%
Other values (38) 614600
25.4%

infraspecificEpithet
Text

Missing 

Distinct: 1864
Distinct (%): 15.8%
Missing: 326664
Missing (%): 96.5%
Memory size: 2.6 MiB

Length

Max length: 20
Median length: 17
Mean length: 9.024456522
Min length: 3

Characters and Unicode

Total characters: 106272
Distinct characters: 31
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 736
Unique (%): 6.2%

Sample

1st row: cinereus
2nd row: benjamina
3rd row: mexicana
4th row: doliatus
5th row: pallidirostris
Value                 Count   Frequency (%)
pennsylvanicus        615     5.2%
cinereus              493     4.2%
insignis              267     2.3%
melas                 246     2.1%
talpoides             246     2.1%
noveboracensis        196     1.7%
dickeyi               167     1.4%
dorsalis              125     1.1%
cherriei              124     1.1%
sacarensis            107     0.9%
Other values (1857)   9195    78.0%

Most occurring characters

ValueCountFrequency (%)
i 12488
11.8%
s 11660
11.0%
a 11265
10.6%
e 9324
8.8%
n 8882
 
8.4%
r 6808
 
6.4%
u 6451
 
6.1%
c 5916
 
5.6%
l 5451
 
5.1%
o 5239
 
4.9%
Other values (21) 22788
21.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 106259
> 99.9%
Space Separator 5
 
< 0.1%
Dash Punctuation 4
 
< 0.1%
Other Punctuation 2
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 12488
11.8%
s 11660
11.0%
a 11265
10.6%
e 9324
8.8%
n 8882
 
8.4%
r 6808
 
6.4%
u 6451
 
6.1%
c 5916
 
5.6%
l 5451
 
5.1%
o 5239
 
4.9%
Other values (16) 22775
21.4%
Space Separator
ValueCountFrequency (%)
5
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 106259
> 99.9%
Common 13
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 12488
11.8%
s 11660
11.0%
a 11265
10.6%
e 9324
8.8%
n 8882
 
8.4%
r 6808
 
6.4%
u 6451
 
6.1%
c 5916
 
5.6%
l 5451
 
5.1%
o 5239
 
4.9%
Other values (16) 22775
21.4%
Common
ValueCountFrequency (%)
5
38.5%
- 4
30.8%
. 2
 
15.4%
( 1
 
7.7%
) 1
 
7.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 106272
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 12488
11.8%
s 11660
11.0%
a 11265
10.6%
e 9324
8.8%
n 8882
 
8.4%
r 6808
 
6.4%
u 6451
 
6.1%
c 5916
 
5.6%
l 5451
 
5.1%
o 5239
 
4.9%
Other values (21) 22788
21.4%

taxonRank
Text

Missing 

Distinct: 7
Distinct (%): 0.1%
Missing: 326679
Missing (%): 96.5%
Memory size: 2.6 MiB

Length

Max length: 10
Median length: 10
Mean length: 9.75733356
Min length: 3

Characters and Unicode

Total characters: 114756
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: subspecies
2nd row: variety
3rd row: subspecies
4th row: subspecies
5th row: subspecies
Value        Count   Frequency (%)
subspecies   10856   92.3%
variety      846     7.2%
forma        39      0.3%
var          18      0.2%
agg          1       < 0.1%
fo           1       < 0.1%

Most occurring characters

ValueCountFrequency (%)
s 32568
28.4%
e 22558
19.7%
i 11702
 
10.2%
b 10856
 
9.5%
p 10856
 
9.5%
c 10856
 
9.5%
u 10856
 
9.5%
a 904
 
0.8%
r 903
 
0.8%
t 846
 
0.7%
Other values (8) 1851
 
1.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 114611
99.9%
Uppercase Letter 125
 
0.1%
Other Punctuation 20
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 32568
28.4%
e 22558
19.7%
i 11702
 
10.2%
b 10856
 
9.5%
p 10856
 
9.5%
c 10856
 
9.5%
u 10856
 
9.5%
a 904
 
0.8%
r 903
 
0.8%
t 846
 
0.7%
Other values (6) 1706
 
1.5%
Uppercase Letter
ValueCountFrequency (%)
V 125
100.0%
Other Punctuation
ValueCountFrequency (%)
. 20
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 114736
> 99.9%
Common 20
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 32568
28.4%
e 22558
19.7%
i 11702
 
10.2%
b 10856
 
9.5%
p 10856
 
9.5%
c 10856
 
9.5%
u 10856
 
9.5%
a 904
 
0.8%
r 903
 
0.8%
t 846
 
0.7%
Other values (7) 1831
 
1.6%
Common
ValueCountFrequency (%)
. 20
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 114756
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 32568
28.4%
e 22558
19.7%
i 11702
 
10.2%
b 10856
 
9.5%
p 10856
 
9.5%
c 10856
 
9.5%
u 10856
 
9.5%
a 904
 
0.8%
r 903
 
0.8%
t 846
 
0.7%
Other values (8) 1851
 
1.6%
Distinct: 8732
Distinct (%): 5.3%
Missing: 174042
Missing (%): 51.4%
Memory size: 2.6 MiB

Length

Max length: 60
Median length: 52
Mean length: 9.057299967
Min length: 2

Characters and Unicode

Total characters: 1489002
Distinct characters: 87
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 1928
Unique (%): 1.2%

Sample

1st row: (Riley)
2nd row: (Roding)
3rd row: Krylova & Moskalev
4th row: Kearfott
5th row: (Leconte)
Value                 Count    Frequency (%)
(blank)               18497    7.9%
linnaeus              4238     1.8%
l                     3882     1.7%
walker                3705     1.6%
barnes                3618     1.5%
mcdunnough            3336     1.4%
hobbs                 3050     1.3%
dyar                  2658     1.1%
busck                 2449     1.0%
grote                 2439     1.0%
Other values (4970)   186253   79.6%

Most occurring characters

ValueCountFrequency (%)
e 122229
 
8.2%
a 106115
 
7.1%
r 99205
 
6.7%
n 88200
 
5.9%
69727
 
4.7%
o 67794
 
4.6%
l 64822
 
4.4%
i 63701
 
4.3%
s 62738
 
4.2%
( 55139
 
3.7%
Other values (77) 689332
46.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1027746
69.0%
Uppercase Letter 220001
 
14.8%
Space Separator 69727
 
4.7%
Other Punctuation 59119
 
4.0%
Open Punctuation 55139
 
3.7%
Close Punctuation 55139
 
3.7%
Dash Punctuation 2131
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 122229
11.9%
a 106115
10.3%
r 99205
 
9.7%
n 88200
 
8.6%
o 67794
 
6.6%
l 64822
 
6.3%
i 63701
 
6.2%
s 62738
 
6.1%
u 50039
 
4.9%
t 47035
 
4.6%
Other values (38) 255868
24.9%
Uppercase Letter
ValueCountFrequency (%)
B 23903
10.9%
H 21389
 
9.7%
M 19083
 
8.7%
S 18205
 
8.3%
L 17414
 
7.9%
D 15350
 
7.0%
C 14994
 
6.8%
G 13194
 
6.0%
W 12869
 
5.8%
R 9509
 
4.3%
Other values (21) 54091
24.6%
Other Punctuation
ValueCountFrequency (%)
. 40323
68.2%
& 18497
31.3%
' 237
 
0.4%
, 62
 
0.1%
Space Separator
ValueCountFrequency (%)
69727
100.0%
Open Punctuation
ValueCountFrequency (%)
( 55139
100.0%
Close Punctuation
ValueCountFrequency (%)
) 55139
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2131
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1247747
83.8%
Common 241255
 
16.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 122229
 
9.8%
a 106115
 
8.5%
r 99205
 
8.0%
n 88200
 
7.1%
o 67794
 
5.4%
l 64822
 
5.2%
i 63701
 
5.1%
s 62738
 
5.0%
u 50039
 
4.0%
t 47035
 
3.8%
Other values (69) 475869
38.1%
Common
ValueCountFrequency (%)
69727
28.9%
( 55139
22.9%
) 55139
22.9%
. 40323
16.7%
& 18497
 
7.7%
- 2131
 
0.9%
' 237
 
0.1%
, 62
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1485509
99.8%
None 3493
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 122229
 
8.2%
a 106115
 
7.1%
r 99205
 
6.7%
n 88200
 
5.9%
69727
 
4.7%
o 67794
 
4.6%
l 64822
 
4.4%
i 63701
 
4.3%
s 62738
 
4.2%
( 55139
 
3.7%
Other values (50) 685839
46.2%
None
ValueCountFrequency (%)
ü 1265
36.2%
é 848
24.3%
è 433
 
12.4%
ö 240
 
6.9%
ø 171
 
4.9%
ä 134
 
3.8%
á 82
 
2.3%
ê 63
 
1.8%
É 59
 
1.7%
å 38
 
1.1%
Other values (17) 160
 
4.6%